ABSTRACT

A computer algorithm was developed which successfully
locates  and  identifies  human  face(s)  that  are  present  in  a 
digitized  computer  image.  In  the  process  of  finding  the 
facial  image,  the  algorithm  simultaneously  determines  the 
boundary  locations  of  the  sides  of  the  eyes,  the  center  of 
the  face,  the  tip  of  the  nose  and  the  vertical  center  of  the 
mouth. 

Detection of facial images is based on analysis of a
digitized scene for the presence of characteristic facial
feature signatures for the eyes, nose and mouth. These
signatures are generated by the application of a "center of
mass" calculation to each pixel row and column for various
sub-sections  of  the  digitized  scene.  The  presence  of  a  face 
is  confirmed,  and  its  feature  locations  are  determined  based 
on  the  presence  and  location  of  the  local  maxima  and  minima 
which  occur  in  these  curvilinear  signatures. 

Once the face(s) have been located, individual face
recognition is performed by calculating the "gestalt point"
for  six  different  regions  of  the  face.  The  gestalt  is  a 
location,  in  the  2-dimensional  facial  region,  which 
corresponds  closely  to  the  center  of  mass  of  the  pixel 
intensity distribution for that region. Identification of
the unknown face is performed by comparing its gestalts with
those  for  known  faces,  using  a  distance  metric. 

The  algorithm's  face  locator  function  was  tested  on  139 
facial images representing thirty different subjects. The
algorithm successfully located and bounded the internal
region of the face in 94% of the cases. Further tests,
against a limited number of arbitrary backgrounds, indicate
that  the  algorithm  is  highly  specific  for  faces. 

The  face  recognition  portion  of  the  algorithm  was  tested 
against  a  database  of  20  different  subjects.  In  two  trial 
runs,  using  the  same  20  person  database,  18  of  the  20 
subjects  were  in  the  top  three  candidates  selected.  In  one 
trial  run,  the  algorithm  successfully  identified  the  unknown 
individual (selected the proper individual as the number one
candidate)  60%  of  the  time.  In  the  other  trial  run  the 
proper  candidate  was  identified  50%  of  the  time.  Recognition 
was based solely on analysis of the internal facial
features.


I.  Introduction 


Background 

In his thesis "Performance of a Face Recognition
Machine Using Cortical Thought Theory" (22; Appendix A),
Capt. Robert L. Russel successfully developed and tested a
Face  Recognition  Machine.  The  Face  Recognition  Machine,  when 
given  a  digitized  picture  of  a  human  face,  is  capable  of 
identifying  the  person  to  whom  the  face  belongs. 
Identification  is  accomplished  by  first  "training"  the 
machine  to  recognize  an  individual's  face.  Another  picture 
(one  on  which  the  machine  has  not  been  trained)  is  then 
input to the machine, and a program is run to identify the
individual. Captain Russel tested the machine using a data
base of 20 different individuals. Test results indicated
that the machine was able to successfully identify the test
subject 90% of the time. In the remaining 10%, the test
subjects were high on the machine's candidate list (second or
third  in  a  list  of  20). 

To train the machine, a series of pictures of an
individual's face is taken (with a video camera) and stored
in a digital data base on a general purpose computer. The
digitized facial pattern data is then processed according to
the "Cortical Thought Theory" (CTT) model of the human
brain, developed by Captain Richard Routh (21).

Cortical Thought Theory involves the use of an
algorithm which calculates the "gestalt" of a given
pattern. According to this theory, the "gestalt" represents
the essence or "single characterization" uniquely assigned
to an entity (in this case a 2-dimensional image) by the
human brain. Mathematically, the gestalt is calculated using
a  2-dimensional  discrete  transform  which  operates  on  the 
pixels  (individual  picture  elements)  of  the  digitized 
picture.  The  result,  for  any  given  picture,  is  a  set  of 
numbers  (cardinality  of  2)  which  represent  a  location  on  a 
2-dimensional  grid.  This  location  varies  according  to  the 
pattern  of  the  image  being  transformed.  In  general,  the 
gestalt  location  seems  to  closely  correspond  to  the  "center 
of  mass"  of  the  pixel  intensity  values  (with  dark  pixels 
having  the  higher  mass). 
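
As a rough illustration of the center of mass approximation described above, the following Python sketch computes an (X,Y) point for a small grey-level image. It is not the CTT transform itself, which is described here only as a 2-dimensional discrete transform. The function name `center_of_mass`, the grey-level range of 0 (black) to 255 (white), and the choice of mass as 255 minus intensity are assumptions made for the example.

```python
def center_of_mass(image, max_level=255):
    """Return the (x, y) center of mass of a grey-level image.

    `image` is a list of rows; each pixel runs from 0 (black) to
    max_level (white). Dark pixels carry more mass, so the mass of a
    pixel is assumed to be max_level - intensity.
    Returns None when the image has no mass at all (every pixel white).
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            mass = max_level - value
            total += mass
            sum_x += mass * x
            sum_y += mass * y
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)
```

Applied to one window of a face, a function of this kind would yield a single point comparable to the gestalt points discussed in the remainder of this chapter.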

When  initially  tested  using  pictures  of  the  entire 
face,  the  system's  ability  to  discriminate  between  different 
faces  (using  the  process  outlined  above)  was  poor.  A 
technique  referred  to  as  "windowing"  was  then  developed  to 
improve the performance of the system. In the windowing
technique, the face is automatically divided into
characteristic regions and then each region is processed
according to the CTT model. Figure 1-1 illustrates the
windows used and the results of the gestalt calculation for
each window. The numbers in parentheses (above each
respective  window)  correspond  to  the  "gestalt"  or  (X,Y) 
point  of  the  center  of  mass. 


Thus, the training process involves calculating a set
of gestalt points (one for each window) for each of several
pictures of an individual, and storing the gestalt points
obtained for that individual in the computer memory. When
identifying an unknown person, the machine calculates the
unknown person's gestalt points and compares these values
with those stored from previous training sessions. The
person corresponding to the gestalt points closest (overall)
to those of the unknown person is selected as the prime
candidate.
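
The comparison step can be sketched as a nearest neighbor search. The text says only that the person with the closest gestalt points (overall) is selected; the sum of Euclidean distances over windows, the single stored gestalt set per person, and the names `overall_distance` and `rank_candidates` are assumptions for illustration and are not taken from Russel's software.

```python
import math


def overall_distance(gestalts_a, gestalts_b):
    """Sum of Euclidean distances between corresponding window gestalts."""
    return sum(math.dist(p, q) for p, q in zip(gestalts_a, gestalts_b))


def rank_candidates(unknown, database):
    """Return names ordered from closest to farthest from the unknown face.

    `database` maps a person's name to a list of (x, y) gestalt points,
    one per window, in the same window order as `unknown`.
    """
    return sorted(
        database,
        key=lambda name: overall_distance(unknown, database[name]),
    )
```

The first name in the returned list plays the role of the prime candidate, and the next entries correspond to the second and third places on the candidate list.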

Problem 

In Russel's implementation, the machine often requires
operator intervention in setting up the proper window
boundaries. These boundaries are indicated by the horizontal
and vertical lines in the picture located in the upper right
hand corner of figure 1-1. The operator intervention is
subjective in nature (such as having to determine where the
bottom of the eyes is located) and the machine's
performance is sensitive to it. As the data base continues
to grow, it is expected that this sensitivity will become
more critical.

For this thesis effort, the author proposes to remove
the machine's reliance on operator intervention. This requires
that the machine be able to locate a face or faces in any
given picture and perform the required windowing in a
completely autonomous fashion.

Scope 

This thesis effort is concerned with enabling the
machine to autonomously locate and window the human face
from a digital image, locate the major features of that
face, and create various sub-windows of that face for
further face recognition processing. The system is designed
to process a picture with more than one face in it. No
effort will be made to speed up the gestalt calculation
process (as developed by Captain Russel) or compensate for
variations in face orientation with respect to the camera.

Assumptions

In  any  given  picture,  the  subject(s)  are  looking 
squarely  at  the  camera  (there  is  no  tilt  or  rotation  of  the 
head). Subjects' faces are not obscured and lighting is
consistent.  The  subject  is  not  wearing  glasses  and  the 
facial  area  is  free  of  unusual  markings.  The  subject  is  not 
moving  and  has  a  relatively  relaxed  expression  (the  face  is 
not  being  deliberately  contorted).  Four  pictures  for  each 
subject  are  sufficient  to  characterize  a  person.  The  basic 
CTT  algorithm  used  in  the  original  machine  is  valid. 

Standards

Test  results  must  meet  the  same  criteria  as  set  out  in 
Captain  Russel's  thesis: 

1.  System  must  demonstrate  "human  like" 
classification  of  human  face  images. 

2.  System  performance  must  be  repeatable. 

3. Recognition performance must be as good as that
obtained by Captain Russel (90% successful).

4. All critical components and assumptions remain
consistent with CTT.


In  addition,  the  system  must  never  require  operator 
intervention  in  locating  and  windowing  facial  patterns  in  a 
picture.  Routine  operator  control  such  as  initiating  and 
terminating  the  process  will  still  be  required.  The  system 
must  also  be  able  to  process  pictures  with  multiple  (at 
least  two)  faces  in  them. 

General  Approach 

As  stated  earlier,  the  machine  has  demonstrated  a 
sensitivity to subjective operator intervention. It would be
useful,  both  theoretically  and  practically,  if  the  machine 
could  be  made  to  autonomously  locate  and  bound  a  face,  as 
well  as  perform  any  necessary  windowing.  To  accomplish  this 
task,  a  software  routine  was  developed  which  locates  the 
facial  image(s)  by  performing  scene  analysis  of  the 
digitized  picture.  The  basic  algorithm  for  this  technique  is 
a  facial  signature  search  routine.  The  key  signature  used  is 
one which corresponds to the human eyes. Once a set of
possible eyes has been located, the software routine then
examines the image for a nose and mouth signature. If all
signatures are identified and are located in acceptable
positions, then the presence of a face is confirmed. This
algorithm enables the machine to locate one or more faces
anywhere in a picture, since it is shift invariant. Unlike the
original machine, it is necessary to work within the "inner"
boundaries of the face. This domain effectively excludes
access to information in the hair, ears and bottom of chin.
This constraint is necessary to allow for processing of
images in which the background is not required to be noise
free.

Once  the  faces  have  been  located  and  bounded,  each  face 
is  processed  separately.  Contrast  enhancement  of  each  facial 
image  is  implemented  using  a  slightly  modified  version  of 
Russel's  original  histogram  stretch  routine.  Ideally,  once 
the  proper  contrast  enhancement  has  taken  place,  all  and 
only  that  information  which  makes  up  the  essence  of  the  face 
is  displayed.  The  windowing  process  is  then  performed  using 
the  original  software  developed  by  Captain  Russel  (modified 
as  necessary).  In  general,  the  original  software  is  utilized 
(for  example  the  data  base  software)  to  the  maximum  degree 
possible.
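
For readers unfamiliar with histogram stretching, a generic linear stretch is sketched below. This is the textbook form of the technique, not Russel's routine or the modified version used here, whose details are not given in this chapter. The function name `histogram_stretch` and the 0 to 255 output range are assumptions.

```python
def histogram_stretch(image, out_min=0, out_max=255):
    """Linearly map the darkest pixel to out_min and the brightest to out_max.

    `image` is a list of rows of integer grey levels. A flat image (all
    pixels equal) is returned unchanged, since there is no range to stretch.
    """
    low = min(min(row) for row in image)
    high = max(max(row) for row in image)
    if high == low:
        return [list(row) for row in image]
    span = high - low
    out_span = out_max - out_min
    return [
        [out_min + round((value - low) * out_span / span) for value in row]
        for row in image
    ]
```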

Recognition  testing  will  be  performed  using  a  minimum 
of  20  different  subjects.  For  each  subject,  the  system  will 
have been trained with a minimum of 4 different pictures.
One picture will be held in reserve for testing (not used
for  training)*.  Results  will  be  compared  with  the  criteria 
set  forth  in  the  "Standards"  section  above  and  with  Captain 
Russel's  original  results.  Several  pictures,  with  one  or 
more  subjects  in  them,  will  also  be  tested  to  verify  the 
AFRM's  ability  to  locate  faces  in  a  noisy  background. 


Materials and Equipment


The  equipment  necessary  to  develop  and  test  the 
required  computer  software  is  as  follows: 

Data General Eclipse S/250 Computer System
Data  General  Nova  2  Computer  System 
Octek  2000  Video  Processing  Board 
Dage  650  Video  Camera  (F-Stop  and  Zoom  control) 
Panasonic  WV-5490  Monochrome  Monitor 
Tektronix  4632  Video  Hard  Copy  Unit. 

All  equipment  is  available  and  located  in  the  AFIT 
Signal  Processing  Lab. 

Thesis Structure

The first chapter is intended to serve as an
introduction to the Face Recognition Machine (FRM). This
chapter  contains  a  brief  review  of  the  underlying  theory  and 
performance  capabilities  of  the  FRM.  It  also  outlines  the 
problem  which  this  thesis  addresses  and  the  approach  taken 
to  solve  this  problem. 

The  purpose  of  chapter  two  is  to  review  the  literature 
concerning  face  recognition.  After  a  brief  review  of  the 
general  literature  on  automated  facial  image  processing,  the 
reader  is  acquainted  with  several  previous  research  efforts 
which  closely  parallel  the  work  done  in  this  thesis  effort. 

In  chapter  three  the  reader  will  find  a  description  of 
the  design  and  development  of  the  face  finding  and  face 
recognition algorithms. Since the algorithm used for face
recognition is largely the same algorithm that was used in
Russel's  work  (22),  discussion  on  this  topic  is  limited  to 
any  necessary  modifications.  The  emphasis  of  this  and  the 
next  chapter  is  on  the  face  finding  algorithm  which  is 
unique  to  this  thesis. 

Chapter  four  explains  the  system  implementation  details. 
The reader is first given a brief description of the hardware
and software used in the development and test of this
system. The reader is then provided with an in-depth look at how the
Autonomous Face Recognition Machine (AFRM) is implemented on
the  Eclipse/Nova  Data  General  Computer  system. 

Chapter  five  contains  descriptions  and  results  of 
testing  designed  to  evaluate  the  AFRM's  ability  to  find  and 
identify  individual  faces. 

Conclusions  based  on  the  results  of  tests,  and 
recommendations  for  future  development  are  described  in 
the sixth (last) chapter of this thesis.


II.  Background of Facial Image Processing


General Aspects

"No other object in the visual world is quite so
important to us as the human face" (6:1). The previous quote
is well substantiated by the existence of an extensive
literature base on the general topic of recognition and
perception of human faces. In the same text (6), there are
several hundred references concerning this topic. Another
author (2) has published a separate bibliography on topics
concerned with only face recognition. The range of various
aspects concerning this topic is also quite remarkable.
Studies concerning the legal, developmental, psychological,
emotional and other aspects of face perception are readily
available.

The reasons for such an extensive interest in this topic
are many. Obviously, almost everybody is interested in his
own ability to recognize and correctly interpret the
information contained in facial images. This is commonly
attributed to the fact that man is a highly social animal.
Without the aforementioned abilities, the social
interactions of any individual are severely handicapped. For
those whose interests lie in how we perceive faces, the
reasons are sometimes not quite so obvious. In the case of
this thesis, the motivation is to see if a machine can be
produced which finds and identifies human faces.


In attempting to implement the pattern recognition
capabilities of the human brain on a machine, one quickly
comes to the realization that the brain is an astounding
pattern recognition machine. Many theories have been put
forth concerning the mechanism by which the brain performs
various (e.g. visual) pattern recognition tasks, but none
appears to have been more than, at best, partially successful
in mimicking the brain's performance. The bottom
line is still that the state of the art in machine pattern
recognition has not even come close to matching the speed
and versatility with which the brain (not necessarily just
human) performs pattern recognition.

In  the  special  case  of  "facial"  pattern  recognition, 
there is a wealth of studies available. These studies are
concerned with such aspects as spatial frequency analysis
(8; 19), effects of feature displacement (10), effects of
changes in visible area (13), and eye movement strategies
(25) associated with facial images. However, in the case of
machine analysis of visual images, both for the presence and
identification of facial images, relatively few theories or
models have been proposed and tested. The next sections of
this chapter will acquaint the reader with some of the
previous work that has been accomplished in this area. The
discussion in the following sections has been divided
between finding facial images and identifying to whom those
facial images belong. The result of this approach is that
the parts of a researcher's work which are pertinent to
locating faces appear in one section, and the parts
pertaining  to  facial  recognition  appear  in  the  following 


Figure  2-1  Example  of  a  Line  Extracted  Image  (24:243) 

per  unit  area),  larger  windows  (such  as  9  x  9)  might  have  to 
be  used  since  the  degree  of  intensity  change  across  a 
smaller  window  may  be  too  gradual  to  allow  accurate  edge 
detection.  Through  a  fairly  complex  technique  of 
thresholding,  basic  line  elements  are  then  established  and 
extended  to  yield  the  final  line  extracted  image.  One 
example  of  the  results  of  such  a  process  is  illustrated  in 
figure  2-1. 

The  authors  admit  that  this  line  extraction  is  not 
adequate  for  a  number  of  different  types  of  photographs. 
Exactly  what  types  of  photographs  this  technique  was  not 
suitable  for  they  did  not  indicate. 


The next phase in the process is to locate the face or
faces in the line extracted image using a pattern matching
technique.  Figure  2-2  illustrates  the  templates  used  in  the 
pattern  matching  process. 


Figure  2-2  Template  Used  for  a  Human  Face  (24:244) 

After  the  appropriate  adjustment  (apparently  by  a  human 
operator)  of  the  size  of  the  templates  to  match  the  size  of 
the facial images in the line extracted photo, the image is
scanned  to  find  a  match  with  templates  4  and  5  (the  facial 
outline).  Once  this  is  achieved,  the  image  is  evaluated  to 
see  how  well  it  matches  with  template  6.  Template  6 
describes  regions  in  which  no  or  relatively  few  lines  should 
be  present.  Finally,  the  image  is  evaluated  to  see  if  there 
is  a  match  between  it  and  templates  1,  2  and  3.  This  last 
requirement is very flexible in that if any one of the
templates (of 1, 2 or 3) is matched strongly, or if two or
three  of  the  set  are  matched  mildly,  the  result  is  still 
considered  a  match.  When  all  of  the  aforementioned  criteria 
are  met,  the  presence  of  a  face  is  considered  to  be 
confirmed  at  the  location  of  the  central  point  of  the 


Although  the  authors  claim  this  technique  was  successful 
in  finding  faces  in  many  different  pictures,  they  gave  no 
figures  on  just  how  successful  they  were.  As  might  be 
expected  in  any  technique  which  relies  on  grey  level 
intensity analysis, the authors affirmed the susceptibility
of  this  technique  to  variations  in  lighting. 

In another effort related to locating a face in a visual
image, Bromley (4:17) used a facial signature technique to
locate  the  vertical  center  and  sides  of  the  face.  The 
primary  emphasis  of  Bromley's  thesis  was  to  develop  the 
capability  to  automatically  locate  various  facial  features, 
once  the  location  of  the  facial  image  was  known.  However,  as 
part  of  the  initial  processing,  her  algorithm  required  the 
capability  to  locate  a  face  in  a  somewhat  benign 
environment.  The  limited  constraints  under  which  the  search 
for  the  face  was  conducted  are  evident  by  the  use  of  police 
mug  file  images  which  were  roughly  centered  on  the  face  in 
question  and  in  which  the  facial  image  constituted  a  large 
part of the entire image. Figure 2-3 illustrates an example
of  the  images  used. 

Figure 2-3 Typical Mug File Image

Also shown in figure 2-3 (bottom part) is an example of
the results of applying a "row signature" technique to the
mug file digital image. The signature is generated by simply
summing the rows of pixel intensity values for each column
and plotting the results. A characteristic peak almost
always occurs in coincidence with the center of the face,
and characteristic "valleys" occur at the locations of the
sides of the face. Bromley reported (4:47) fairly consistent
results using this technique in a limited number of cases.
Her algorithm relies on a fairly pristine background and
certain physical attributes of the subject face being
analyzed. Still, it does suggest a possible technique for
locating facial images under much less constrained
conditions. In fact, the basic concept used by Bromley is
remarkably similar to the technique used to generate the
eye  signature  as  discussed  in  the  next  chapter. 
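
A minimal sketch of a signature of this kind follows. It sums the pixel intensities down each column and then reads off a center and two sides. Taking the highest peak as the face center and the deepest valley on each side as the face sides is a simplification assumed for the example; Bromley's actual peak and valley selection rules are not described here, and the function names are hypothetical.

```python
def column_signature(image):
    """Sum pixel intensities down each column of a row-major image."""
    return [sum(column) for column in zip(*image)]


def locate_center_and_sides(signature):
    """Return (left, center, right) column indices from a signature.

    center is the highest peak; left and right are the deepest valleys
    on either side of it. Returns None if the peak lies on the image
    border, where no valley can exist on one side.
    """
    last = len(signature) - 1
    center = max(range(len(signature)), key=signature.__getitem__)
    if center == 0 or center == last:
        return None
    left = min(range(center), key=signature.__getitem__)
    right = min(range(center + 1, len(signature)), key=signature.__getitem__)
    return (left, center, right)
```

Because the whole image collapses to one number per column, the search runs over a short list rather than over every pixel, which is the source of the speed advantage noted later in this chapter.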

A  third  technique  related  to  locating  facial  images  in  a 
digitized  image  is  based  on  image  correlation.  In  an 
experiment  conducted  by  Baron  at  the  University  of  Iowa 
(1:145-151),  a  system  was  devised  to  perform  automatic  face 
recognition  from  digitized  images  of  faces.  As  in  Bromley's 
case,  the  main  thrust  of  the  research  was  not  to  investigate 
automatic  facial  image  location,  but  rather  automatic  face 
recognition.  Unlike  the  mug  file  images  used  by  Bromley,  the 
pictures  used  in  this  case  did  contain  background  clutter 
and  were  taken  under  much  less  stringent  conditions.  The 
pictures  were  full  face  images  as  illustrated  in  figure  2-4. 
Again  the  author  was  faced  with  the  task  of  locating  the 
face  (or  at  least  part  of  the  face)  as  an  initial  step  of 
the  processing.  The  approach  taken  was  to  search  the  image 
for the subject's eyes using "eye templates" in a correlation
algorithm as illustrated in figure 2-4.

Figure 2-4 Correlation Procedure for Locating Eyes (1:146)


The author indicates that the purpose of locating the
eyes is to facilitate standardization of the size of the
facial image based on the distance between the eyes. Once
the size of the facial image has been standardized, further
correlation mechanisms are used to perform recognition as
discussed in the next section of this chapter. Baron reports
that the technique enjoyed considerable success in both
locating the eyes and performing recognition. This technique
could be extended to searching for a face in an image in
which the face occupied a much smaller portion of the
overall image. This would constitute a very large
computation since various template sizes as well as types
would have to be correlated with every location in the
image.
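
To make the cost argument concrete, a brute force template search can be sketched as below. Baron's correlation measure is not specified in this review, so zero-mean normalized cross-correlation is assumed, and the function names are hypothetical. The two nested loops visit every template position; an autonomous search would have to repeat them for every template size and type.

```python
import math


def normalized_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A flat patch or template has no variance; treat it as no match.
    return num / den if den else 0.0


def best_match(image, template):
    """Return ((row, col), score) for the best-scoring template position."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, -2.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_correlation(patch, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```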

Each of these three approaches is different
from the others. The only one of the three which addresses
the more complex problem of extracting facial images from
noisy backgrounds is Sakai's approach. However, Sakai's
approach is reliant on a human operator to adjust the size
of the masks to the size of the facial images in the test
picture.  In  an  autonomous  machine  this  would  mean  that  the 
search  would  have  to  be  conducted  over  a  range  of  mask 
sizes.  This  approach,  like  Baron's,  would  be  a  very  time 
consuming  task.  The  problem  of  locating  a  face  in  Bromley's 
work  was  the  least  complex  of  the  three.  The  major 
distinction  between  her  work  and  the  other  two  is  that  she 
first  transforms  the  facial  image  into  a  simple  signature 
and  then  determines  the  location  of  the  face.  The  primary 
advantage  of  such  a  technique  is  speed. 

Recognition of Individual Faces

One  of  the  basic  assumptions  of  this  thesis  is  that  the 
CTT  based  recognition  scheme  designed  by  Russel  (22)  is 
valid.  To  apply  that  theory  and  design,  it  is  necessary  to 
window  portions  of  the  face  based  on  consistent  facial 
landmarks  or  features.  Automatic  facial  feature  location 
then  becomes  an  absolute  prerequisite  to  applying  the  CTT 
based  model.  The  following  is  a  brief  review  of  previous 
work  done  on  locating  facial  features  for  the  purposes  of 
classifying  and/or  recognizing  individual  faces. 


There  have  been  several  attempts  made  at  automated  and 
semi-automated facial feature measurement, usually with the
goal of either recognizing a face or at least classifying
the type of face. Semi-automated facial feature measurement
typically  involves  a  human  operator  working  interactively 
with  a  machine  (computer)  to  analyze  facial  images.  It  is 
usually  the  human  operator  who  performs  the  pattern 
recognition  task  and  the  computer  that  does  the  bookkeeping 
and  classification.  Some  of  the  major  early  contributors  in 
semi-automated  measurement  were  Bledsoe  (3)  and  Harmon  (12). 
The  results  of  their  work  have  been  summarized  in  Captain 
Russel's  thesis  (22)  and  will  not  be  reiterated  here. 

In  the  area  of  fully  automated  machine  classification 
and/or  recognition  of  human  faces,  Harmon  (11)  again  was  one 
of  the  key  contributors.  Unfortunately  the  work  performed  by 
Harmon was concerned with human face "profiles" and to a
large  degree  is  not  applicable  to  this  thesis  (since  this 
effort  is  concerned  solely  with  frontal  face  views). 

Another  researcher,  Bromley  (4),  developed  an  automatic 
feature  locating  algorithm  for  use  in  reducing  the  number  of 
mug  file  pictures  that  had  to  be  reviewed  by  a  witness  to  a 
crime. The types of features located by her algorithm are
indicated  in  figure  2-5. 

The technique used by Bromley to locate the vertical
features (LF or left face, NL or nose line, and RF or right
face) was described earlier in the section on locating
faces. The horizontal feature locations (for example the
eyes or LE) were acquired by analyzing a curve generated by
taking  a  finite  difference  derivative  of  the  pixel  values 
along  the  vertical  nose  line.  This  information,  when  used  in 
conjunction  with  estimates  of  where  the  appropriate  features 
should  be,  allowed  fairly  accurate  determination  of  the 
desired  feature  locations.  For  instance,  in  determining  the 
top  of  head  boundary  location,  one  would  search  down  (along 
the  nose  line)  until  the  first  maximum  negative  derivative 
value  was  found.  Assuming  the  top  of  the  head  is  darker  than 
the  background  (a  safe  assumption  in  mugshots),  and  that  the 
background  is  relatively  noise  free,  the  top  of  the  head 
would  then  correspond  to  the  location  of  the  maximum 
negative derivative. Since the center, sides and top of head
are now known, reasonable estimates can be made concerning
the location of the rest of the desired features. Bromley
apparently had reasonably good success with this technique.
Unfortunately, development of an algorithm which could use
the feature location information to classify facial images
was not a part of her work. Thus no facial
classification/identification performance statistics are
available.
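
The top of head search described earlier in this paragraph can be sketched as follows. The sketch takes the most negative forward difference along the nose line column as the top of the head, which is one reading of "first maximum negative derivative". The function names, and the assumption that larger pixel values are brighter, are made for the example and are not taken from Bromley's work.

```python
def finite_difference(values):
    """Forward difference: d[i] = values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


def top_of_head_row(image, nose_col):
    """Row where intensity drops most sharply going down the nose line.

    `image` is a list of rows and `nose_col` is the column index of the
    vertical nose line. Returns the index of the first row on the dark
    side of the sharpest bright-to-dark transition, or None if the image
    has fewer than two rows.
    """
    column = [row[nose_col] for row in image]
    diffs = finite_difference(column)
    if not diffs:
        return None
    steepest = min(range(len(diffs)), key=diffs.__getitem__)
    return steepest + 1
```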


In 1965 (18:78), Vander Lugt demonstrated how
instantaneous recognition of faces could be accomplished
using the optical computer illustrated in figure 2-6. Such a
device performs an "optical" correlation of the input image
with the test image. The result is a bright spot on the
output pattern which corresponds with the location of an
image of similar size and spatial arrangement that is on
the test image.


Figure  2-6  Optical  Computer  for  Recognizing  Faces  (1:139) 

Baron  (1:140)  proposed  a  model  for  the  way  the  brain 
processes  information  based  on  the  same  concept  of 
correlation.  As  mentioned  previously,  he  implemented  and 
tested  an  automatic  face  recognition  system  based  on  this 


theory.  Baroo  et  al  used  an  IBM  System  360  Model  65  computer 
In  conjunction  with  a  Spatial  Data  806  Computer  Eye 
digitizer.  The  Initial  Images  were  512  by  480  pixels  which 
were  subsequently  reduced  to  128  by  120  pixels  through 
averaging  of  4  by  4  squares  of  the  original  Image.  After 
locating  the  subject's  eyes  (described  previously)  and 
adjusting  the  size  of  the  facial  image  to  a  standard  size 
based  on  the  distance  between  the  eyes,  the  Image  was 
reduced  even  further  to  one  of  15  by  16  pixels.  Further 
looks  at  various  regions  of  the  face  were  also  acquired  and 
reduced  to  15  by  16  pixels  as  illustrated  in  figure  2-*7. 
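
The first reduction step (512 by 480 pixels to 128 by 120 pixels) amounts to averaging non-overlapping 4 by 4 squares. A minimal sketch is given below; the function name `block_average` and the representation of the image as a list of rows are assumptions for the example, and any rows or columns left over when the size is not a multiple of the block size are simply dropped.

```python
def block_average(image, block=4):
    """Reduce an image by averaging non-overlapping block x block squares."""
    rows = len(image) // block
    cols = len(image[0]) // block
    reduced = []
    for r in range(rows):
        reduced_row = []
        for c in range(cols):
            total = sum(
                image[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            )
            reduced_row.append(total / (block * block))
        reduced.append(reduced_row)
    return reduced
```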

The only instance where the system was not performing in
a fully autonomous mode was in the determination of which
various sub-looks (mouth, chin and others) were to be stored
in its data base. A human user was required to supply the
computer with a list of features of interest and apparently
had to point out the exact location of those features during
the initial training session. Additional pictures of the
same subject were then added to the data base using the
control information initially supplied by the user. Each
face was represented by up to 20 templates like the ones
indicated in figure 2-7. The system was tested using a data
base of 42 individuals. Recognition accuracy was 100%. In
addition, a set of over 150 faces (which included the
original 42 faces) was tested, with the system rejecting
the faces not in the data base and recognizing all of the
faces that were in the data base. The system performance
remained unaffected by face rotations up to 20 degrees,
which interestingly is consistent with some recent findings
(17:340) concerning the performance of what appear to be
face specific neurones in the rhesus monkey brain. Finally,
it was found that unlike the performance of human face
recognition, the system could not adequately handle large
changes in the size of the image.

Figure 2-7 Baron's Stored Representations of a Face Image (1:147)

Although  other  efforts  have  been  conducted  (14;  15;  23), 
the  only  remaining  effort  in  face  recognition  of  consequence 
to this thesis is that conducted by Russel (22). Since the
theory,  design  and  performance  are  covered  elsewhere 
(Appendix  A  and  chapter  1)  the  reader  is  referred  to  those 
areas  for  any  desired  further  background  information. 


This  chapter  documents  the  top  level  design  and  the 
development  and  design  decisions  that  were  made  in 
developing  the  Autonomous  Face  Recognition  Machine  (AFRM). 
The  chapter  is  divided  into  two  main  sections.  The  first 
section  concerns  development  and  design  of  the  algorithm 
which  determines  the  presence  and  location  of  facial  images 
within  a  digital  image.  The  second  section  examines  the 
algorithm  used  in  performing  face  recognition.  The  decision 
to  divide  the  development  along  these  lines  is  not  totally 
arbitrary.  Experts  in  the  field  of  face  recognition  seem  to 
agree  that  these  two  tasks  are  indeed  separate  and  serial 
(5:321).  The  phrase  "separate  and  serial"  refers  to  the  fact 
that  a  face  must  be  first  identified  as  a  face  before 
further  specialized  processing  (facial  recognition)  can  be 
performed.

The  Face  Finder 

As  mentioned  in  the  introduction,  the  approach  used  to 
find  a  face  in  a  visual  image  is  dependent  on  the  use  of 
certain  facial  area  signatures.  The  general  nature  of  these 
signatures  is  such  that  they  are  almost  always  present  when 
a  face  is  in  the  image  and  are  sometimes  present  when  a  face 
is  not  in  the  image.  An  algorithm  was  developed  which 
methodically  searches  for  these  face  specific  signatures  in 
some  test  image.  Once  a  facial  image  has  been  found,  the 
system  stores  the  necessary  data  for  later  retrieval  by  the 
face  recognition  algorithm.  The  following  sub-sections 


provide  a  detailed  explanation  concerning  what  these 
signatures  are,  how  they  were  developed,  and  how  they  are 
used  in  locating  facial  images. 

A brief review of the literature reveals that the
number of ways of segmenting an image is limited only by
the imagination and tools of the researcher. Rather than
attempting  to  evaluate  all  of  these  techniques  and  then  pick 
the  best  possible  one  for  use  in  locating  faces,  the 
decision  was  made  to  use  the  gestalt  calculation  (or  some 
variation  of  it)  as  was  originally  introduced  in  Russel's 
thesis  (22).  Since  the  gestalt  calculation  has  a 
demonstrated  capability  for  distinguishing  between  faces, 
then  it  should  be  possible  to  apply  that  same  calculation  to 
locating  faces.  Any  calculation  capable  of  detecting  the 
nuances  between  individual  faces  ought  to  be  capable  of 
distinguishing  faces  from  other  classes  of  objects. 
Additionally, if the brain is capable of doing this
calculation in the performance of recognition, then it is
also possible that a similar calculation is used in
locating a face.

The  first  idea  to  come  to  mind  is  to  use  the  original 
face  recognition  algorithm  (as  developed  by  Russel),  in  a 
direct  manner,  by  developing  a  set  of  windows  and  gestalt 
values  for  a  "generic  face"  and  evaluating  an  image  to  see 
where and if the gestalt values of the image match up with
the gestalt values of the generic face. Assuming that such a
generic facial feature set could be developed, the major
problem with this approach is that searching an image in this
manner would be a very computation intensive task.
Essentially,  every  pixel  location  of  the  image  would  have  to 
be  evaluated  using  several  windows.  If  the  facial  size  were 
variable,  then  additional  calculations  (for  each  pixel 
location)  would  have  to  be  made  over  the  range  of  possible 
sizes.  Using  such  a  technique,  the  number  of  false  alarms 
would  be  expected  to  be  high  when  attempting  to  confirm  the 
presence  of  a  face.  This  high  false  alarm  rate  would  be  due 
mainly  to  the  use  of  a  limited  number  of  large  windows  (i.e. 
six).  The  reduction  of  information  is  so  great  that,  for  any 
particular  window,  a  large  number  of  non-facial  spatial 
distributions  may  result  in  the  required  gestalt. 

On the hypothesis that perhaps the reduction of
information in the original algorithm was too large, the
results of performing the gestalt calculation on larger sets
of smaller windows were examined. Figure 3-1 illustrates what
is obtained when a 64 by 64 image is divided into 64 windows
of 8 by 8 pixels. Except for a modification to perform the
calculation on an 8 by 8 instead of a 64 by 64 window, the
calculation of the gestalt locations for each 8 by 8 window
(in figure 3-1) was done in exactly the same manner as
Russel's original calculation (Appendix A: A-3).
The gestalt locations appear as the
single dark pixels within each 8 by 8 box. Boxes which were
completely  white  received  no  gestalt  value.  The  reader 
should note that the facial images in figure 3-1 were
enhanced to remove unwanted extraneous shadows prior to
performing the gestalt transformations. Had this not been
the case, then every 8 by 8 area would have a gestalt point
in it, making the pattern somewhat more difficult to
ascertain. Shifting the facial image slightly down (shown in
figure 3-1b) causes a corresponding shift of the gestalt
locations, which, as was indicated previously, tend to follow
the "center of mass" of the pixel intensity. The only
discernable and consistent patterns which appeared in images
like these (figure 3-1) were for the eyes, nose and mouth, and
then only very roughly. The patterns obtained using the 8 by
8 window approach were deemed too inconsistent to be useful
in locating faces. However, the results did indicate that,
if a consistent pattern was to be found, it would have to be
done using the "internal feature area" of a facial image.
The "internal feature area" refers to the area of the face
including only the forehead, eyes, nose and mouth.

Another  important  observation  was  the  need  to  enhance 
the  facial  image  to  remove  unwanted  shadows.  In  an  automated 
procedure,  the  enhancement  inevitably  has  to  be  done  based 
on some characteristic(s) of the facial image. This
requirement creates a dilemma. In order to use a pattern to
locate a facial image, the facial image must be properly
enhanced; however, the facial image can't be properly
enhanced until the face has been located. This mandates
that any characteristic facial gestalt patterns, whatever
they may be, must be initially obtained from the unaltered
original image.

Development of the Eye Signature. Whenever someone is
asked to locate something in a visual image, he typically
looks  for  some  unique  characteristic  of  that  object.  For 
example,  when  looking  for  a  tank,  one  might  look  for  the 
turret  and  gun  barrel  configuration  or  possibly  the  treads. 
When  looking  for  a  rabbit  the  subject  might  look  for  two 
long  ears.  When  looking  for  a  face  (at  least  from  a  frontal 
view),  the  singularly  most  distinctive  characteristic  to 
look  for  is  the  eyes. 

Motivated  by  the  previous  reasoning  and  the 
observations  made  concerning  figure  3-1,  investigations  were 
conducted  to  determine  what  kinds  of  possible  patterns  could 
be generated using the intensity information about the eyes.
Figure 3-2 is indicative of the patterns obtained when just
the eye portion of the original (unmodified) facial image
is subjected to Russel's gestalt calculation (22:5-42).

The  "windows"  used  in  figure  3-2  are  vertical  slices 
which  are  4  pixels  wide  by  64  pixels  long  (indicated  by  the 
vertical  bars).  Horizontal  windows  were  not  examined  since 
the  eyes  are  symmetric  about  the  vertical  axis  and  thus  the 
center  of  mass  (gestalt)  for  all  points  would  essentially 
form a straight vertical line. Since any intensity image
that is symmetric about a vertical axis would form this same
signature, this would be an almost useless signature.

Figure 3-2 Gestalts for Eyes - 4 by 64 vertical windows.


The gestalt pattern illustrated in figure 3-2 is fairly
shift invariant. Shift invariance, in this context, is taken
to mean that if the eyes are translated to another position,
the gestalt pattern is shifted along with the eyes, with no
change in the size or shape of the pattern. To maximize the
shift invariance, the vertical windows were reduced in size
to the highest resolution of the system, namely, 1 by 64
windows. Using such windows resulted in patterns which were
essentially shift invariant. Thus, no matter where the face
was placed in the image, the same pattern would emerge.
Figure 3-3a illustrates a typical example of what is
obtained using 1 by 64 windows on the eye portion of the
unaltered original image.

Figure 3-3 Gestalts for Eyes - 1 by 64 vertical windows.

The gestalt pattern illustrated in figure 3-3a can
hardly be called a "signature". However, figure 3-3b
illustrates that, when the "background" (the non-informative
areas above and below the eyes) is filled in with intensity
values representing mid-grey scale level instead of pure
white, the gestalt pattern does begin to take on a
discernable form. In fact, the form of the pattern is not
unlike the one obtained by Bromley (see the row signature in
figure 2-3). There is the beginning of a characteristic peak
at the center of the signature which seems to correspond to
the center of the face. In addition, there are
characteristic minima that seem to correspond to the sides
of the face.

It  was  observed  that  the  gestalt  patterns,  generated  by 
using  background  fill  intensities  other  than  white  (15), 
were  position  sensitive.  Specifically,  when  the  eye  image  is 
moved  in  the  vertical  direction,  the  signature  changes 
significantly.  This  problem  is  easily  solved  by  ensuring 
that  all  eye  images  are  relocated  to  a  standard  position 
within  the  64  by  64  image.  It  was  found  that  the  best 
position  was  at  the  top  of  the  64  by  64  image.  This 
placement  of  the  image  coupled  with  the  optimum  background 
setting  (also  determined  empirically)  results  in  very 
consistent patterns for the eyes. It should be noted at this
point that there were a total of 16 grey levels available,
with "0" intensity representing black and an intensity of
"15" representing white. Figure 3-4 illustrates the patterns


the  locations  of  the  three  central  minima  provides  exact 
definition  of  just  where  those  boundaries  lie.  The  minima  to 
either  side  of  the  dual  peaks  are  defined  as  the  left  and 
right  sides  of  the  eyes,  respectively.  The  central  minimum 
is  defined  as  the  center  of  eyes. 

The  manner  in  which  the  eye  signature  is  obtained  is  as 
follows.  An  8  pixel  (vertical)  by  64  pixel  (horizontal) 
window  is  extracted  from  a  64  by  64  whole  face  image.  The  8 
by  64  window  is  placed  at  the  top  of  a  new  64  by  64  image 
and  the  remaining  rows  of  the  64  by  64  image  are  filled  in 
with  a  background  grey  level  intensity  of  12.  The  resulting 
image  is  then  transformed  into  the  eye  signature  pattern  by 
calculating  the  gestalt  point  for  each  of  the  64  vertical 
columns  in  the  image.  The  calculation  used  is  identical  to 
that  used  by  Russel  (22:5-42)  in  determining  the  1-D  gestalt 
transform  and  is  given  as  follows: 

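Given the discrete input image

$$M_{kh}$$

where $k, h = 1, \ldots, 64$ and $0 \le M_{kh} \le 15$.

The transformed image is given as:

$$T_{ih} = \sum_{k=1}^{64} M_{kh} \exp\left[-\frac{1}{2}\left(\frac{k-i}{63\sigma}\right)^{2}\right] \qquad (3\text{-}1)$$

where $i, h = 1, \ldots, 64$ and $\sigma = 0.435$.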

Once  the  above  transformation  has  been  performed,  the 
gestalt  location  for  each  vertical  column  is  determined  by 
finding  the  location,  in  each  column,  which  contains  the 


largest  value.  The  64  gestalt  locations  can  then  be  directly 
displayed  as  in  figure  3-4  or  saved  as  a  discrete  signal  to 
be  used  in  further  analysis. 
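The construction of the eye window image and the column-wise gestalt transform described above can be sketched as follows. This is a minimal Python sketch and not the original AFRM code. The function names, the zero-based row and column indices, and the list-of-rows image layout are assumptions made for the example, and the factor of one half in the exponent is assumed to be the usual Gaussian form.

```python
import math

SIZE = 64       # images are SIZE by SIZE, as in the text
SIGMA = 0.435   # sigma of equation 3-1
FILL = 12       # background grey level used for the eye window


def eye_window_image(face, top_row, height=8, fill=FILL):
    """Copy `height` rows of `face`, starting at `top_row`, to the top of
    a new SIZE by SIZE image and fill the remaining rows with `fill`."""
    window = [list(row) for row in face[top_row:top_row + height]]
    padding = [[fill] * SIZE for _ in range(SIZE - len(window))]
    return window + padding


def column_gestalts(image, sigma=SIGMA):
    """Return, for each column h, the row index i at which
    T[i][h] = sum over k of M[k][h] * w(k - i) is largest."""
    n = len(image)
    scale = (n - 1) * sigma          # 63 * sigma for a 64-row image
    # Gaussian weight for every possible row offset |k - i|.
    # Assumption: the exponent carries the usual factor of 1/2.
    weight = [math.exp(-0.5 * (d / scale) ** 2) for d in range(n)]
    gestalts = []
    for h in range(len(image[0])):
        column = [image[k][h] for k in range(n)]
        best_i, best_val = 0, float("-inf")
        for i in range(n):
            val = sum(column[k] * weight[abs(k - i)] for k in range(n))
            if val > best_val:       # the first maximum wins on ties
                best_i, best_val = i, val
        gestalts.append(best_i)
    return gestalts
```

Calling `column_gestalts(eye_window_image(face, top_row))` would give the 64 gestalt locations that form a candidate eye signature for the window starting at `top_row`.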

After ensuring (subjectively) that the eye signature
was present for at least one of the several images available
for each individual in the data bank (45 different facial
images), an attempt was made to characterize the eye
signature. Although there are several ways to transform and
represent the eye signature curve, such as fitting the curve
to spline curves, polynomial coefficients or Fourier
coefficients, a more direct method was chosen. The eye
signature curves are analyzed by directly evaluating
characteristics of the maxima and minima
present within the curve. To avoid problems due to noise,
which might shift the location of a peak or create small
"false peaks", the discrete signal representing the eye
signature is smoothed by convolving it with a Gaussian
function. This smoothing operation makes the gestalt value
for a given column a function of the gestalt value for that
column, plus the gestalt values of the two nearest
columns on both sides of it. Once the curve
has been smoothed, each pair of successive peaks is
detected and analyzed to determine if it meets the "eye"
criteria. The local maxima and minima are detected based on
the instantaneous changes occurring in the discrete signal
and the previous changes which have occurred. Thus, if all
the previous changes had been of increasing value, and then
a change is detected of decreasing value, it is surmised
that a peak has just been passed.
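The smoothing and extremum detection just described can be sketched in Python as shown here. The sketch is illustrative only. The kernel spread (`sigma=1.0`), the clamping of indices at the ends of the signal, and the function names are assumptions, since the text specifies only that two neighbours on each side contribute.

```python
import math


def smooth(signal, sigma=1.0):
    """Convolve `signal` with a 5-tap Gaussian: each output value depends
    on the sample itself and its two nearest neighbours on each side.
    Assumption: sigma=1.0 and edge samples are repeated at the borders."""
    offsets = (-2, -1, 0, 1, 2)
    taps = [math.exp(-0.5 * (d / sigma) ** 2) for d in offsets]
    total = sum(taps)
    taps = [t / total for t in taps]          # normalise to unit gain
    n = len(signal)
    out = []
    for j in range(n):
        acc = 0.0
        for t, d in zip(taps, offsets):
            idx = min(max(j + d, 0), n - 1)   # clamp at the borders
            acc += t * signal[idx]
        out.append(acc)
    return out


def find_extrema(signal):
    """Return (index, kind) pairs, with kind 'max' or 'min', found by
    watching for a reversal in the direction of change."""
    extrema = []
    trend = 0   # +1 while rising, -1 while falling, 0 before any change
    for j in range(1, len(signal)):
        diff = signal[j] - signal[j - 1]
        if diff > 0:
            if trend < 0:
                extrema.append((j - 1, "min"))   # a valley was just passed
            trend = 1
        elif diff < 0:
            if trend > 0:
                extrema.append((j - 1, "max"))   # a peak was just passed
            trend = -1
    return extrema
```

Flat runs leave the trend unchanged, so an extremum on a plateau is reported at the last sample of the plateau.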


The  primary  criterion  of  the  eye  signature  is  the 
existence  of  two  successive  peaks  whose  maxima  are  located 
equidistant  from  the  minimum  between  them.  Once  this 
criterion has been met, the curve is evaluated to see if the
slopes formed between the outer minima and their respective
peaks are less than 1.5 times the greater of the two slopes
taken between the central minimum and the two peaks. When
these two criteria have been met, the possibility that eyes
are present is confirmed. It is possible that other criteria
may also exist in the eye signature, which were not
exploited in this design. An exhaustive analysis of all
similarities for the eye signature was not performed due
mainly to time constraints. One characteristic similarity
that was considered, but not incorporated as a criterion,
was that the two peaks are roughly equal in height. However,
due  to  variations  in  lighting  this  is  not  always  the  case. 
Under  more  controlled  lighting  conditions,  this  criterion 
could  and  should  be  used. 
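The two criteria can be expressed as a short test on the five extrema of a candidate peak pair. The following Python sketch is one possible reading of the criteria and is not the original implementation. The `tolerance` allowed on the equidistance test and the use of average slope magnitudes between extrema are assumptions.

```python
def slope(signal, a, b):
    """Magnitude of the average slope between sample indices a and b."""
    return abs(signal[b] - signal[a]) / abs(b - a)


def is_eye_pair(signal, left_min, left_max, mid_min, right_max, right_min,
                tolerance=2, ratio=1.5):
    """Apply the two eye criteria to one pair of successive peaks.

    The five index arguments locate the outer minima, the two maxima and
    the central minimum. `tolerance` (in samples) is an assumed allowance
    on the equidistance test.
    """
    # Criterion 1: both maxima lie (nearly) equidistant from the
    # minimum between them.
    if abs((mid_min - left_max) - (right_max - mid_min)) > tolerance:
        return False
    # Criterion 2: each outer slope is less than `ratio` times the
    # greater of the two inner slopes.
    inner = max(slope(signal, left_max, mid_min),
                slope(signal, mid_min, right_max))
    outer_left = slope(signal, left_min, left_max)
    outer_right = slope(signal, right_max, right_min)
    return outer_left < ratio * inner and outer_right < ratio * inner
```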

Development of Nose/Mouth Signature. The "eye
signature" is not unique to just the eyes. Many objects and
arrangements of objects could give rise to the eye
signature, such as two vertical bars which are darker than
the areas to either side of them. In order to specify a
total facial signature, which is more specific to the face,
further development was necessary.

A number of factors drove the decision to investigate
the presence of a nose/mouth signature. First, the eyes are,
at least in part, defined by the rest of the face. This is
especially true in reduced quality or abstract images. Thus,
if one wishes to verify that an object is indeed an eye,
one can determine the context in which it appears. If
another eye appears next to it, and a nose and mouth appear
in roughly their proper positions below the eyes, then one
can be reasonably certain that one is looking at an eye, or
in fact a face. Second, there is an inevitable need to
perform a contrast expansion on the facial image to remove
noisy facial shadows. In Russel's original algorithm the
initial contrast enhancement is based on the analysis of the
intensity data contained in the region whose top and bottom
boundaries are defined by the top of the eyes and tip of the
nose, respectively. The left and right boundaries can be
defined roughly by vertical lines through the left and right
eye pupils. Although there does not appear to be an exact
correspondence, the locations of the maxima for the two peaks
in the eye signature are consistently close to the positions
of the pupils. In addition, the row corresponding to the top
row of the 8 by 64 "eye" window is usually close to what may
be called the top of the eyes. Thus, the only bit of
information remaining to be determined, in order to apply
Russel's contrast expansion, is the location of the tip of
the nose (hereafter referred to as the "top of nose").


Using  essentially  the  same  technique  as  was  used  to 
develop  the  eye  signature,  a  vertical  facial  signature, 
hereafter  referred  to  as  the  nose/mouth  signature,  was 
developed.  In  this  case,  a  vertical  window  (slice)  is 
extracted  from  the  original  facial  image  and  placed  at  the 
extreme  left  side  of  a  new  64  by  64  image  (reference  figure 
3-5). That part of the facial image from a vertical line
about the center of the left pupil to a vertical line about
the center of the right pupil defines the region of
interest. Once this sub-image has been placed at the extreme
left side, the remaining columns of the 64 by 64 image are
filled with a grey level intensity of 12. The resulting
image is then transformed into the nose/mouth signature
using essentially the same calculation as shown in equation
3-1. The only difference here is that now a gestalt is
calculated for each row instead of each column. Figure 3-5
illustrates examples of the gestalt patterns obtained using
unaltered original images. The approximate position of the
top of the eye window (the window that verifies the presence
of the eyes), from a previous analysis of the full facial
image, is shown as a solid horizontal line in the gestalt
pattern. An investigation of the pattern below the "eyes"
reveals that in every case there is a significant increase
in pixel intensity (corresponding to the bright area
immediately below the eyes), followed by a significant
decrease in pixel intensity which corresponds to the darker
areas about the top of the nose. Once the location of
maximum brightness (below the eyes) has been determined
using a simple sort routine, the top of the nose can just as
easily be determined. The top of the nose is defined as the
point at which the average of the differences between the
next three consecutive gestalt points is less than or equal
to zero. For the purposes of this calculation, the gestalt
values increase as one proceeds from left to right in the
gestalt patterns of figure 3-5.
Applying  this  definition  in  a  search  routine  which  begins  at 
the  point  of  maximum  brightness  (determined  previously),  and 
proceeds  down  the  nose/mouth  signature  curve  (see  figure 
3-5),  results  in  the  "top  of  nose"  locations  illustrated  by 
the  dashed  lines  in  figure  3-5. 
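The search for the top of the nose can be sketched as below. This Python sketch rests on one interpretation of the definition, namely that at each row the three successive differences starting at that row are averaged. That interpretation, the function name and the zero-based row indices are assumptions and are not taken from the original code.

```python
def top_of_nose(gestalts, start_row, span=3):
    """Return the first row at or below `start_row` for which the mean of
    the next `span` successive differences in the nose/mouth signature is
    less than or equal to zero, or None if no such row exists.

    `gestalts[r]` is the gestalt value for row r. `start_row` is the row
    of maximum brightness below the eyes, found beforehand.
    """
    for r in range(start_row, len(gestalts) - span):
        diffs = [gestalts[r + j + 1] - gestalts[r + j] for j in range(span)]
        if sum(diffs) / span <= 0:
            return r
    return None
```

Averaging over several differences, rather than testing a single one, makes the search less likely to stop at a one-sample dip caused by noise.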

Combining  the  results  of  the  eye  signature  with  the 
results  of  the  nose/mouth  signature,  the  information  thus 
far  gleaned  from  the  original,  unaltered  image  is  as 
follows:

1. A good approximation of the vertical location
of the eyes within the 64 by 64 image, given by
the location of the top row of the eye window.

2. The location of the horizontal center of the face,
given by the location of the central minimum in the
eye signature.

3. The sides of the eyes, given by the minima
located to either side of the two peaks in the
eye signature.

4. A good approximation of where the eye pupils are,
given by the location of the maxima of the two
peaks in the eye signature.

5. The top or "tip" of the nose.

The only major internal facial feature that
remains to be located is the mouth. The center of the mouth
(in the vertical direction) is usually characterized by a
sharp transition from a region of low luminous intensity
(corresponding to the upper lip) to one of high luminous
intensity (the lower lip) under normal lighting
circumstances. Although this transition is perceptible in
the gestalt patterns of figure 3-5, it is just barely so due
to noisy shadows around the mouth area. To overcome this
problem and to enhance the overall accuracy of the feature
locations, a slightly modified version of Russel's initial
enhancement routine (22:5-32) was used. In synopsis,
Russel's enhancement technique stretches the intensity
histogram of an image by multiplying all pixel intensity
values by a constant. The multiplier constant is determined
by iteratively multiplying all pixel intensity values, from
a sampled region of the face, by test multipliers until a
desired average pixel intensity is obtained. The region of
interest is a box-like region centered about the nasal area
of the face. As mentioned earlier, the only parameter
missing up to this point in the development was the top of
nose location. Once the nose location is known, the
boundaries of the box can be completely defined relative to
the size of the face and the location of the facial
features. The left and right boundaries of the box are
defined by the position of the left and right maxima for the
peaks in the eye signature. The top boundary of the box is
defined by the position of the top row of the eye window and
the bottom boundary is defined as the top of nose location.
It is then a simple matter to apply the enhancement as
developed by Russel. Figure 3-6 illustrates the signatures
obtained when the entire facial image is enhanced according
to the modified Russel enhancement technique.
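A rough sketch of this kind of multiplier search is given below in Python. It is not Russel's routine, whose details are documented elsewhere (22:5-32). The step size, the upward search starting from a multiplier of 1.0, the clipping at grey level 15 and all names are assumptions made for illustration.

```python
MAX_GREY = 15   # brightest of the 16 grey levels


def box_mean(image, box, multiplier):
    """Mean intensity inside `box` after scaling and clipping.
    `box` is (top, bottom, left, right), all inclusive (assumed layout)."""
    top, bottom, left, right = box
    total = 0.0
    count = 0
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            total += min(MAX_GREY, image[r][c] * multiplier)
            count += 1
    return total / count


def find_multiplier(image, box, target, step=0.05, max_steps=400):
    """Try multipliers 1.0, 1.0 + step, ... until the mean intensity of
    the sampled box reaches `target`. Assumption: only multipliers of
    1.0 or more are tested."""
    for n in range(max_steps + 1):
        m = 1.0 + n * step
        if box_mean(image, box, m) >= target:
            return m
    return 1.0 + max_steps * step


def enhance(image, multiplier):
    """Scale every pixel of the whole image, clipping at MAX_GREY."""
    return [[min(MAX_GREY, round(p * multiplier)) for p in row]
            for row in image]
```

Here `box` would be built from the eye signature maxima (left and right), the top row of the eye window (top) and the top of nose location (bottom), as described in the text.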

In  figure  3-6,  the  center  pictures  contain  the  patterns 
for  the  eye  signatures,  and  the  pictures  to  the  extreme 
right  contain  the  patterns  for  the  nose  and  mouth.  The 
pictures  to  the  extreme  left  contain  the  subject  face  which 
has  been  marked  to  indicate  the  feature  locations  found  by 
the  computer.  The  windows  used  to  generate  the  eye  and 
nose/mouth  signatures  (not  shown)  are  the  same  as  described 
before  (ref.  figures  3-4  and  3-5).  In  figure  3-6,  the 
signature  for  the  mouth  is  more  readily  apparent.  The  peak 
corresponding  to  the  lips  of  the  mouth  is  the  next  peak 
below  the  one  for  the  nose.  The  reader  will  recall  that  the 
peak  for  the  nose  is  directly  below  that  for  the  eyes 
(marked  by  a  dark  solid  line  corresponding  to  the  top  row  of 
the eye window).

It  is  important  to  note  that  for  these  images  (figure 
3-6),  the  feature  locations  are  found  automatically,  with  no 
assistance from a human operator. The position of the eye
window is determined automatically by scanning the facial
image  (from  the  bottom  up)  until  an  eye  signature  meeting 
the  characteristics  outlined  previously  is  detected.  A 
nose/mouth  signature  is  then  generated  using  the  appropriate 
data  from  the  eye  signature.  These  initial  feature  locations 
are  determined  from  the  original  unaltered  image.  The  facial 
image  is  then  enhanced,  as  described  previously,  and  all 
feature  locations  (including  the  mouth  location)  are 
determined  based  on  the  enhanced  image. 

The  mouth  location  is  determined  by  evaluating  the 
nose/mouth  signature,  in  a  specified  range  below  the  nose, 
for  the  point  of  maximum  negative  slope.  The  center  of  mouth 
location  is  then  taken  as  the  point  immediately  preceding 
the  point  of  maximum  negative  slope.  These  points  are 
indicated in figure 3-6 by dashed lines. Although the
technique of locating the mouth is very simple and
straightforward, the technique does appear to work in the
large  majority  of  cases.  This  method  also  works  in  many 
cases  where  the  subject  has  a  mustache  or  beard.  Further 
examples  of  eye  and  nose/mouth  signatures  are  contained  in 
appendix  C. 
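The mouth search can be sketched in a few lines of Python. In this sketch the slope at a row is taken to be the difference between that row's signature value and the value of the row above it. That convention, the function name and the half-open search range are assumptions.

```python
def mouth_center(signature, start, stop):
    """Search rows start..stop-1 of the nose/mouth signature for the most
    negative step and return the row immediately preceding it.
    Returns None if the signature never decreases in the range."""
    best_row = None
    best_slope = 0
    for r in range(max(start, 1), stop):
        step = signature[r] - signature[r - 1]
        if step < best_slope:        # a steeper descent than any so far
            best_row, best_slope = r, step
    return None if best_row is None else best_row - 1
```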

Searching for a Face. The facial images used in
developing the eye and nose/mouth signatures were full face,
pristine background, 64 by 64 digital images. These images
were obtained from an online data bank of facial images. The
procedure used in searching for the signatures in these
full face images is essentially no different when applied
to the larger problem of locating faces in an image where
the faces comprise a much smaller percentage of the total
image, and where the background can be anything. The
designer simply has to make allowances, in the software, for
such aspects as the analysis of a larger image, the
presence of more than one face and other such aspects.

Most  of  the  design  decisions  made  at  this  point  were 
dictated  by  the  practical  use  of  limited  computer  resources 
(such  as  memory),  and  the  desire  to  have  the  algorithm 
perform  in  at  least  a  "near  real-time"  manner.  The  following 
is  a  list  of  the  performance  constraints  imposed: 

1. The total size of the search area was
limited to 128 by 192 pixels.

2. The range of facial sizes allowed was
from a max size approximately 25% larger
than the average face in the data bank, to
a min size approximately 30% smaller than
that in the data bank.

3.  The  maximum  number  of  faces  that  can  be 
processed  in  any  one  image  is  four. 

4.  Starting  at  the  bottom  of  the  image,  and 
working  its  way  up,  the  search  routine 
analyzes  only  every  other  row  for  an  eye 
signature.  The  routine  does  not  analyze 
every  row. 

The process of searching for a face is begun by
extracting a large central portion of the image displayed on
the monitor. The size of the image, as indicated in the
above list, is limited to an area of 192 pixels wide by 128
pixels long. This size limitation is due to dynamic memory
limitations of the computer, a desire to limit the
complexity of the computer code and a desire to limit the
time required to process any one picture. This 192 by 128
image is hereafter referred to as the "search area". The
location and bounds of the search area are illustrated by the
large white rectangular box cursor in Figure 3-7.

Image  analysis  is  begun  by  extracting  the  bottom  8 
pixel  rows  of  the  search  area  and  transforming  this  8  by  192 
window  into  a  series  of  gestalt  points.  The  transformation 
is  conducted  in  the  same  manner  as  was  described  in 
generating  the  eye  signature.  The  only  difference  is  that 
now the gestalt pattern generated consists of 192 instead of
64 discrete points. The array of gestalt points is then
analyzed to detect the presence of an eye signature using
the criteria set out in the previous section on development
of the eye signature. If no eye signature is detected in the
gestalt array, the 8 by 192 window is incremented up 2 pixel
rows and the whole process is repeated. This process is
continued until an eye signature is detected or the search
area has been completely scanned. The reason the window is
incremented 2 rows, instead of one row, is that the time to
process the entire picture is effectively cut in half. As
long as the allowed facial image size is not too small,
there is little chance that the eyes will be missed by
incrementing 2 rows instead of one row.
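The bottom-up scan can be sketched as a simple loop. In the Python sketch below, `detect` stands in for the eye signature test applied to one 8-row window. It is a hypothetical callback supplied by the caller, and the function name and zero-based row indices are likewise assumptions.

```python
def scan_for_eyes(search_area, detect, window_rows=8, step=2):
    """Slide a window of `window_rows` rows from the bottom of the search
    area upward in increments of `step` rows. Return the top row of the
    first window for which `detect(window)` is true, or None if the whole
    search area is scanned without a detection."""
    top = len(search_area) - window_rows
    while top >= 0:
        window = search_area[top:top + window_rows]
        if detect(window):
            return top
        top -= step
    return None
```

For a search area of 128 rows, a step of 2 rows gives 61 window positions instead of the 121 needed with a step of one row, which is the halving of processing time noted in the text.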

If,  while  scanning  the  search  area,  an  eye  signature  is 
detected,  then  a  64  by  64  sub-section  of  the  original  image 
(as displayed on the monitor) is accessed and further
analyzed.  The  white  box  cursor  in  Figure  3-8  illustrates  the 
64  by  64  image  sub-section  which  is  extracted  from  the 
original  image.  This  method  does  not  require  that  the  full 
facial  image  be  contained  within  the  search  area.  It  only 
requires  that  both  eyes  be  contained  within  the  search  area. 

Analysis  of  the  64  by  64  image  sub-section  proceeds  as 
follows. The existence of the eye signature is re-verified
and the locations of the vertical features (center of face,
etc.) are determined relative to the 64 by 64 image. The top
of  the  nose  in  the  unaltered  image  is  determined  using  the 
nose/mouth  signature,  as  discussed  previously.  The  image  is 
then  enhanced  using  the  modified  Russel  expansion  technique. 
The  enhanced  image  is  reprocessed,  once  again,  to  determine 
all  the  feature  locations  (except  the  mouth).  This  is  done 
to  obtain  greater  accuracy  in  the  feature  locations.  The 
original  image  is  enhanced  a  second  time,  using  the  modified 
Russel  enhancement,  except  this  time  the  bounds  of  the  box 
area  used  for  enhancement  are  those  determined  from  the 
enhanced  picture  (not  the  original).  Using  this  second 
enhancement,  a  final  determination  of  all  the  facial  feature 
locations, including the mouth, is made. If, at any time
during this process, it is found that a facial feature no
longer  exists  or  no  longer  falls  within  acceptable  bounds, 
the  process  is  terminated  and  the  operator  is  informed  that 
a  face  was  not  found  in  this  image  sub-section.  If  a  face  is 
found,  that  is,  all  features  exist  and  are  located  in 
appropriate  locations,  then  the  image  along  with  other 
pertinent  data  is  saved  to  disk  and  the  operator  is  informed 
of  the  presence  of  a  face  within  the  image  sub-section.  In 
addition, when a face is found, that face is marked on the
monitor with a line pattern depicting the feature locations
found. Thus in figure 3-9, the solid lines on the facial
image depict the center of face, sides of eyes, top of nose
and center of mouth. In addition, the horizontal line
immediately above the eyes corresponds to a location
determined by adding the difference between the two maxima
(found in the eye signature) to the top of nose location.
Adding twice that difference to the top of nose location
results in the line at the top of the forehead. The area
within the lined gridwork on the face comprises the area of
interest  referred  to  previously  as  the  "internal  facial 
features".  This  area  is  used  in  further  analysis  to 
determine  the  identity  of  the  individual. 

Whether  or  not  a  face  is  found,  once  analysis  of  the 
image  sub-section  is  completed,  the  algorithm  returns  to  the 
search routine it was involved in prior to accessing the
image sub-section. If a face was found, then the bounds of a
region encompassing the area of the face are stored in
memory so that the face will not be "re-found" on a
subsequent pass. The search continues from the point it left
off at, and proceeds from the bottom of the search area to
the top until four facial images have been found or the
search area has been completely scanned. The choice of a
four face limit was arbitrary. Again, keeping the number of
possible faces limited keeps the speed of performance and
the code complexity at a reasonable level.

The Face Recognizer

This thesis assumes that the basic recognition
algorithm developed by Russel is valid, and only requires
minimal modifications to make it perform in an autonomous
fashion. Only the required modifications are discussed in
this thesis. For discussion concerning the theory and design
of items that are the same as used by Russel, the reader is
referred to the appropriate sections of Russel's thesis
(22).

Window Generation. The major difference between the
face recognition algorithm used by Russel and the algorithm
used in this thesis is in the bounds of the facial windows.
The windows used for face recognition in this thesis must be
constructed  from  the  internal  facial  features  only.  In 
Russel's  work,  additional  facial  boundaries  were  available, 
such  as  the  sides  and  top  of  the  head  and  bottom  of  the 
chin.  These  additional  boundaries  could  be  easily  determined 
provided a pristine background was always used. However, in
the interest of designing a more universal machine, it was
decided  to  forego  the  use  of  these  boundaries.  As  expected, 
the  loss  of  the  information,  such  as  hairstyle,  obtained 
when  using  the  additional  boundaries,  results  in  a  decrease 
in  the  overall  recognition  performance  (see  chapter  5). 
However,  it  is  this  author's  firm  belief  that  this  drawback 
can  be  overcome.  Indeed,  there  is  ample  evidence  to  show 
that,  at  least  in  the  case  of  familiar  faces,  recognition  is 
predominantly  dependent  on  analysis  of  the  internal  facial 
features (7;26).

The  choice  of  windows  used  in  this  design  is 
illustrated  in  figure  3-10.  The  picture  in  the  upper  left 
corner  of  figure  3-10  is  the  original  image  sub-section 
which  was  extracted  automatically  by  the  face  finder 
algorithm  (see  figure  3-8).  The  center  upper  picture  is  the 
same  image  enhanced  based  on  the  final  contrast  multiplier 
constant  found  during  the  face  location  process.  The  picture 
in  the  upper  right  hand  corner  is  the  enhanced  image  with 
the  internal  feature  boundaries  marked. 

The windows indicated in the pictures at the middle left
and middle center positions both have the same top and
bottom boundaries. The bottom boundary is the center of
mouth location and the top boundary is formed by adding
twice the difference between the maxima (taken from the eye
signature) to the top of nose location. The vertical
boundaries for these two images are the center of eyes and
sides of eyes. Similarly, the windows displayed in the
pictures at the middle right and bottom left positions have
the same boundaries, except that the top of nose location
now forms the bottom boundary. The windows depicted in the
pictures located at the bottom middle and bottom right
positions also have the same vertical boundaries (sides and
center of eyes). Their bottom boundaries are determined by
the center of mouth. The top boundaries for these two
windows are obtained by adding the distance between the
sides of the eyes to the center of mouth location.
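
One possible reading of these six window definitions is sketched below. It is an interpretation, not the thesis code. It assumes that each pair of windows consists of a left half (left side of eyes to center of eyes) and a right half (center of eyes to right side of eyes), and that vertical positions are measured so that "adding" moves up the face. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Window:
    left: int
    right: int
    bottom: int
    top: int


def facial_windows(left_eye_side: int, center: int, right_eye_side: int,
                   nose_top: int, mouth: int, peak_gap: int) -> List[Window]:
    """Build six windows from the internal facial feature locations.

    peak_gap is the difference between the two eye signature maxima.
    Vertical positions are assumed to increase toward the top of the head.
    """
    forehead = nose_top + 2 * peak_gap                # top of the first two pairs
    low_top = mouth + (right_eye_side - left_eye_side)
    spans = [
        (mouth, forehead),     # middle left and middle center pictures
        (nose_top, forehead),  # middle right and bottom left pictures
        (mouth, low_top),      # bottom middle and bottom right pictures
    ]
    windows = []
    for bottom, top in spans:
        windows.append(Window(left_eye_side, center, bottom, top))
        windows.append(Window(center, right_eye_side, bottom, top))
    return windows
```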

The  selection  of  the  windows  represents  a  rough,  first 
cut  at  the  problem.  Time  constraints  did  not  allow  an 
exhaustive  development  of  "optimal"  windows.  The  selection 
was  based  on  a  subjective  analysis  of  those  boundaries  that 
might  give  the  best  discrimination  in  the  gestalt  feature 
space. 

Gestalt Calculation. The gestalt values for the windows
shown  in  figure  3-10  are  calculated  in  exactly  the  same 
manner  as  was  done  in  Russel's  algorithm.  The  reader  is 
referred  to  appendix  A  and  Russel's  thesis  (22:5-42)  for 
further  information  on  how  this  calculation  is  performed. 

Training for a Face. In order for the system to
identify a person, it must be "trained" for that person. As
discussed in the introduction to this thesis, training the
system consists of processing the facial images of an
individual to obtain characteristic gestalt values
associated with that individual. These gestalt values are
then stored in the system's memory for later recall and
comparison with the gestalt values of an unknown individual.
If the stored gestalt values match closely with those of the
unknown individual, then the system is usually successful in
properly identifying the unknown individual.

Since  faces  are  extremely  variable  (even  with  respect  to 
internal  features),  it  is  necessary  to  train  the  system  with 
more  than  one  characteristic  facial  image.  The  guideline  of 
training  with  four  pictures  for  each  person,  as  established 
by  Russel,  was  also  used  in  this  thesis. 


The  process  of  training  the  system  is  the  same  process 
as  was  described  in  the  preceding  sections  on  searching  for 
a  face,  window  generation  and  gestalt  calculation.  Once  the 
facial  images  have  been  located,  the  operator  may  choose  to 
either  train  with  the  facial  images  that  were  found  or 
identify  the  facial  images.  If  the  operator  chooses  to 
train,  then,  as  the  window  gestalt  calculations  for  each 
image  are  completed,  the  operator  inputs  which  person  these 
parameters  pertain  to,  then  the  next  image  is  processed. 
Figure  3-11  illustrates  a  sample  set  of  images  used  to  train 
the  system  for  the  test  subject  of  figures  3-7  through  3-10. 

Identifying  the  Face.  If  the  operator  desires  to 
identify  the  faces  found  by  the  face  finder  algorithm,  the 
system  attempts  to  identify  the  faces  in  the  order  that  they 
were found. First, the windows and gestalt values for each
image are generated (as in figure 3-10). The system then
compares  the  gestalt  values  obtained,  with  those  for  which 
it  has  been  trained.  The  individual  that  most  closely 
matches  the  gestalts  for  the  unknown  person  is  chosen  by  the 
computer  as  the  number  one  candidate.  The  recognition 
algorithm  rank  orders  the  candidates  and  computes  a 
pseudo-probability  measure  to  indicate  how  certain  it  is 
that  the  candidate  chosen  is  the  proper  one.  Again,  the 
algorithm  used  to  perform  recognition  is  an  unmodified 
algorithm  developed  by  Russel.  The  reader  is  referred  to 
appendix A and Russel's thesis (Appendix A:A-7; 22:4-32) for
a  more  detailed  explanation  of  the  selection  process.  Figure 
3-12  (on  the  next  page)  illustrates  the  final  result  of  the 
autonomous  face  recognition  machine  algorithm. 
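
The rank ordering step can be illustrated with a generic nearest-match sketch. This is not Russel's algorithm, whose metric and pseudo-probability measure are defined in his thesis. The sketch assumes a plain Euclidean distance, a flattened gestalt vector per image, and scoring each person by the closest of that person's training images. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_candidates(unknown: Sequence[float],
                    trained: Dict[str, List[Sequence[float]]]
                    ) -> List[Tuple[str, float]]:
    """Rank known persons by distance to the unknown gestalt vector.

    trained maps a person's name to the gestalt vectors obtained from that
    person's training images (four per person in the thesis).
    """
    scored = []
    for name, samples in trained.items():
        # Assumption: a person is scored by the closest training image.
        best = min(math.dist(unknown, sample) for sample in samples)
        scored.append((best, name))
    scored.sort()
    return [(name, distance) for distance, name in scored]
```

The first entry of the returned list plays the role of the "number one candidate" described above.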


IV.  Implementation 


This  chapter  explains  how  the  design  discussed  in 
chapter  3  was  implemented  on  the  systems  available  at  the 
AFIT  Signal  Processing  Laboratory.  The  first  section 
discusses  the  hardware  and  software  environment  used. 
Subsequent  sections  discuss  the  software  implementation  of 
the  major  functions  used  to  locate  and  identify  faces. 

System Environment

The  computer  used  to  develop,  implement  and  test  the 
face  locator  and  face  recognizer  algorithms  was  a  Data 
General  computer  system.  This  system  is  comprised  of  two 
separate  computers: 

1.  A  Data  General  Eclipse  S/250,  16-bit  minicomputer. 

2. A Data General Nova II, 16-bit minicomputer.

The Eclipse computer is the more powerful of the two
computers and is primarily used for calculations. The Nova
computer,  while  not  as  fast  or  powerful  as  the  Eclipse,  is 
capable  of  interacting  with  a  video  digitizing  system  and 
is  primarily  used  to  acquire  and  display  image  data.  The  two 
computers  are  linked  by  common  disk  storage  areas  and  can 
communicate  via  "flag"  and/or  data  files  which  can  be 
created  or  deleted  (by  either  computer)  in  the  common  disk 
storage  area.  The  amount  of  user  available  core  memory  in 
either  system  is  approximately  28K  bytes,  at  any  one  time. 
This  memory  limitation  mandated  the  extensive  use  of  such 
techniques  as  program  overlaying  and  swapping  in  order  to 
perform  the  required  algorithms. 


The  video  digitizing  system  used  in  conjunction  with 
the  Nova  computer  is  an  Octek  2000  Image  Analyzer  Card 
(IAC).  This  digitizer  system  is  capable  of  acquiring  or 
displaying  a  320  (horizontal)  by  240  (vertical)  pixel  image. 
The  Octek  processes  only  monochrome  (black  and  white) 
images.  The  number  of  grey  levels  available  on  it  is  16.  The 
video  camera  used  with  this  system  is  a  Dage  650  camera 
which  is  equipped  with  an  adjustable  F-Stop  (F2.5-F16)  and  a 
zoom  lens  (18-108mm).  The  overall  system  configuration  used 
is identical to that used by Russel (22) and is illustrated
in  figure  4-1. 

The  face  locator  and  recognition  algorithms  were 
implemented  in  the  Fortran  computer  language.  Specifically, 
those  programs  which  execute  on  the  Nova  are  compiled  in 
Data  General  Fortran  IV,  and  those  which  execute  on  the 
Eclipse are compiled in Data General Fortran V. These Data
General Fortran versions conform to ANSI standards, thus aiding
program  transportability  from  one  machine  to  another. 
However,  the  extensive  use  of  library  functions  in  the 
programs,  both  Fortran  and  IAC  library  functions,  may  make 
the  task  of  transporting  the  software  quite  difficult. 

Figure 4-1. System Configuration.

Top Level Programs

Each computer (Nova and Eclipse) has a top level
manager program which is initiated by executing a macro file
on the desired system. On the Eclipse the macro file is
"FINDFACE.MC" and on the Nova it is "AUTOFACE.MC". These
macros ensure that the system is initialized correctly
before  bringing  up  the  top  level  manager  programs.  On  the 
Eclipse,  proper  initialization  means  ensuring  that  any 
needed  directories  are  initialized  and  any  old  data  or  flag 
files  have  been  deleted.  Once  this  has  been  accomplished, 
the  top  level  program  "FACEFNDR"  is  executed  on  the  Eclipse. 
The  Nova  is  initialized  in  the  same  manner,  plus  the  Nova 
ensures  that  the  program  FACEFNDR  has  been  successfully 
initiated  on  the  Eclipse.  To  verify  that  the  Eclipse  is 
ready,  the  Nova  checks  the  common  disk  area  directory 
(directory NSMITH) for the presence of the flag file
"GREENLIGHT". The file GREENLIGHT is a "flag" file because
it is a file which contains no data (a 0 length file). This
file is created by FACEFNDR when it begins execution. By
virtue  of  its  existence  in  the  directory,  the  file 
GREENLIGHT  is  a  signal  to  the  Nova  that  the  Eclipse  has 
successfully  performed  the  FINDFACE  initialization  routine. 
Many  such  flag  files  are  used  in  this  implementation  to 
allow  communication  and  task  synchronization  between  the 
Eclipse  and  Nova  systems.  If  the  system  is  to  perform 
correctly,  careful  housekeeping  of  these  flag  files  is 
paramount.  Thus,  once  a  flag  file  has  served  its  purpose,  it 
is  immediately  deleted  from  the  directory. 
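
The flag file mechanism can be illustrated with a short sketch. The original programs were written in Data General Fortran, so the following is only a modern analogue. The helper names are hypothetical and the shared directory is represented by an ordinary path.

```python
from pathlib import Path


def raise_flag(directory: Path, name: str) -> None:
    """Create a zero-length flag file; its existence is the signal."""
    (directory / name).touch()


def flag_is_set(directory: Path, name: str) -> bool:
    """Check whether the other machine has raised the flag."""
    return (directory / name).exists()


def consume_flag(directory: Path, name: str) -> bool:
    """Delete the flag once it has served its purpose.

    Returns True if the flag was present, False otherwise.
    """
    try:
        (directory / name).unlink()
        return True
    except FileNotFoundError:
        return False
```

In this analogue the Eclipse side would call `raise_flag(shared, "GREENLIGHT")` at startup, and the Nova side would poll `flag_is_set` and then call `consume_flag` so that a stale flag cannot be mistaken for a new signal.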

When  verification  of  Eclipse  initialization  has  been 
accomplished,  the  Nova  top  level  program  "GETSUBJ"  is 
executed.  Both  top  level  programs  (FACEFNDR  and  GETSUBJ) 
execute in a continuous loop until receipt of a command from
the operator to terminate. The program FACEFNDR is a
synchronized program which responds to signals (in the form
of flag files) from the Nova. Thus, once the system has been
initialized, all operator commands are given from the Nova
terminal.  Figures  4-2a  and  4-2b  illustrate  the  top  level 
program  flow  for  both  the  Eclipse  and  Nova  systems. 

Image Acquisition

In  order  to  process  an  image,  the  desired  image  must  be 
loaded  into  the  Octek  digitizer.  There  are  two  ways  to 
accomplish  this.  One  way  is  to  acquire  the  image  through  the 
use  of  the  video  camera.  This  can  be  accomplished  directly 
through  GETSUBJ  by  selecting  the  program  option  to  activate 
the  camera,  setting  up  the  picture,  and  then  selecting  the 
program option to turn off the camera. Turning off the
camera freezes the image in the Octek. Another way to load
the image in the Octek is to use the program "OCTEK" to
load (and display) a previously saved image.

Once  the  desired  image  is  displayed  on  the  monitor 
screen  (and  thus  loaded  in  the  Octek's  frame  buffer), 
selection  of  the  GETSUBJ  program  option  "process  picture", 
initiates the acquisition and analysis of the image search
area. The reader will recall that the image search area is a
192 by 128 pixel area extracted from the central portion of
the Octek image (see figure 3-7). GETSUBJ stores the search
area image data, to the common disk directory "NSMITH", as a
file named "SUBJECT.VD". Analysis of the image data (by
FACEFNDR on the Eclipse) is then initiated by the creation
of the flag file "SUBJECT1.A". The high level flow chart for
the "process picture" option is illustrated in figures 4-3a
and 4-3b.

Figure 4-2b. Top Level Program Flow (continued)

Image  Processing  to  Find  Faces 

Once the program FACEFNDR has been initiated on the
Eclipse, it continuously checks for the existence of various
flag files in directory NSMITH. Upon detecting the existence
of the flag file SUBJECT1.A, FACEFNDR initiates analysis of
the data in SUBJECT.VD for the eye signature.

Starting with the bottom 8 rows of the image search
area, FACEFNDR successively determines the gestalt value for
each column of the 8 row by 192 column window. The gestalt
values are determined in the same manner as discussed in
chapter 3. To the bottom of each column of 8 pixels (from
the window on the search area), 56 more pixels are added, to
construct a single column of 64 pixels in total length. The
intensity value assigned to these added pixels is 12. This
single column of 64 pixels is then transformed via the
gestalt calculation indicated in equation 3-1. The actual
transformation is performed by the subroutine "RTRAM". The
transformed column is then evaluated by FACEFNDR to
determine the location of the row containing the maximum
value. The location of the maximum is then stored as the
gestalt value for that column. The determination of the
gestalt values proceeds in this manner, column by column,
until all 192 columns have been processed. The gestalt
values are saved in the array "STRIPDATA", which is a 400
element integer array. The first 384 elements of this array
are reserved for gestalt location data with odd numbered
elements containing the row location and even numbered
elements containing the column location of the gestalt
point. The column locations are arranged in increasing order
within the array. Thus, element 2 of the array contains the
number "1" to indicate that the row value in element 1 of
the array is associated with column 1. Element 4 of the
array contains the column number "2", element 6, the column
number "3", etc.
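
The column padding and the STRIPDATA layout can be sketched as follows. The gestalt transformation itself (equation 3-1) is not reproduced here. The function names are hypothetical, and the sketch assumes each column is listed from top to bottom so that the added pixels are appended at the end. Python lists are 0-indexed, so Fortran element k corresponds to index k-1.

```python
from typing import List, Sequence

PAD_VALUE = 12      # intensity assigned to the added pixels (from the text)
COLUMN_LENGTH = 64  # length of the padded column (from the text)
STRIP_SIZE = 400    # size of the STRIPDATA array (from the text)


def pad_column(pixels: Sequence[int]) -> List[int]:
    """Extend the 8 window pixels to a 64 pixel column using value 12."""
    if len(pixels) != 8:
        raise ValueError("expected the 8 pixels of one window column")
    return list(pixels) + [PAD_VALUE] * (COLUMN_LENGTH - len(pixels))


def pack_strip(gestalt_rows: Sequence[int]) -> List[int]:
    """Interleave (row, column) pairs in the STRIPDATA layout.

    gestalt_rows[i] is the gestalt row value for column i + 1.
    """
    strip = [0] * STRIP_SIZE
    for i, row in enumerate(gestalt_rows):
        strip[2 * i] = row        # Fortran odd element 2i+1: row location
        strip[2 * i + 1] = i + 1  # Fortran even element 2i+2: column number
    return strip
```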

Once the STRIPDATA array has been filled with the
gestalt data generated from the 8 by 192 window, this data
is passed to the subroutine "EYEF" which analyzes the data
for the presence of an eye signature. In order to reduce the
problems due to noise in the signature, the data in
STRIPDATA is smoothed by convolution with a gaussian
function. To perform this operation, the "magnitude" of the
gestalt must be defined. This magnitude is defined as the
difference between the gestalt row value for any given
column and a baseline row value equal to 36. The baseline
value is the gestalt row value obtained when all 8 pixels
(in the window) are white. Using this definition for the
magnitude, the smoothed gestalt value for any particular
column (at location "N") is determined by adding, to the
magnitude at the column of interest, 0.7 times the magnitude
of the gestalts for the columns to either side (at locations
N+1 and N-1), and 0.3 times the magnitude of the
gestalts for the columns once removed and to either side (at
locations N+2 and N-2).
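
The smoothing step amounts to a five point weighted sum with weights 0.3, 0.7, 1, 0.7 and 0.3. A minimal sketch follows. The sign convention of the magnitude and the treatment of columns near the ends of the window (missing neighbors contribute zero) are assumptions, since the text does not specify them.

```python
from typing import List, Sequence

BASELINE_ROW = 36  # gestalt row value for an all-white column (from the text)


def magnitudes(gestalt_rows: Sequence[int]) -> List[int]:
    """Magnitude of each gestalt relative to the baseline row.

    The sign convention (row minus baseline) is an assumption.
    """
    return [row - BASELINE_ROW for row in gestalt_rows]


def smooth(mag: Sequence[float]) -> List[float]:
    """Apply the 0.3, 0.7, 1, 0.7, 0.3 weighting described in the text."""
    n = len(mag)

    def at(i: int) -> float:
        # Assumption: columns outside the window contribute nothing.
        return mag[i] if 0 <= i < n else 0.0

    return [
        at(i) + 0.7 * (at(i - 1) + at(i + 1)) + 0.3 * (at(i - 2) + at(i + 2))
        for i in range(n)
    ]
```

Note that the weights are not normalized, so the smoothed values are larger than the raw magnitudes. This does not matter for locating maxima and minima, which depend only on relative values.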

A  point  by  point  analysis  of  the  smoothed  gestalt  curve 
is  then  conducted  to  detect  an  eye  signature.  The  analysis 
proceeds  from  left  to  right  (in  relation  to  the  original  8 
by  192  window),  which  corresponds  to  starting  with  column  1 
and  proceeding  in  increasing  order.  As  each  pair  of 
successive  peaks  is  detected,  they  are  tested  to  see  if  they 
meet  the  criteria  for  an  eye  signature.  Recall  that  the 
criteria  for  an  eye  signature  are  as  follows: 

1. Two peaks with maxima located equidistant
from a central minimum.

2.  The  outer  slopes  of  the  two  peaks  are  not 
greater  than  1.5  times  the  greater  of  the 
two  inner  slopes. 

In  reference  to  criterion  2  above,  the  respective 
slopes  (illustrated  in  figure  4-4)  are  determined  as  the 
difference  between  the  magnitudes  at  the  respective  maxima 
and  minima  locations,  divided  by  the  difference  between  the 
respective maxima and minima column locations. It is
important to note how the minima locations are determined
when they occur on a "plateau" or in a region where the
magnitude is unchanging over several columns. In the case of
the central minimum, the location is taken as the
half-way point along the plateau. For the left outer
minimum, the location of the minimum is taken as the extreme
right of the plateau. For the right outer minimum, the
location of the minimum is taken as the extreme left of the
plateau. In reference to criterion 1 above, the acceptable
tolerance for the equidistant location of the central minimum
is set at 7.5% of the difference between the two maxima
locations.
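
The two criteria and the plateau rules can be expressed as a small test. This is a sketch of one reading of the text, not the EYEF subroutine itself. It assumes that the tolerance is applied to the distance between the central minimum and the midpoint of the two maxima, and that criterion 2 compares each outer slope with 1.5 times the larger inner slope. The names are hypothetical.

```python
from typing import NamedTuple


class Point(NamedTuple):
    column: float
    magnitude: float


def plateau_location(first: float, last: float, role: str) -> float:
    """Pick the column that represents a minimum lying on a plateau."""
    if role == "central":
        return (first + last) / 2  # half-way point
    if role == "left_outer":
        return last                # extreme right of the plateau
    if role == "right_outer":
        return first               # extreme left of the plateau
    raise ValueError(f"unknown role: {role}")


def slope(minimum: Point, maximum: Point) -> float:
    """Magnitude difference divided by column difference (absolute value)."""
    return (abs(maximum.magnitude - minimum.magnitude)
            / abs(maximum.column - minimum.column))


def is_eye_signature(left_min: Point, left_max: Point, center_min: Point,
                     right_max: Point, right_min: Point,
                     tolerance: float = 0.075,
                     slope_ratio: float = 1.5) -> bool:
    columns = [p.column for p in
               (left_min, left_max, center_min, right_max, right_min)]
    if any(a >= b for a, b in zip(columns, columns[1:])):
        return False  # points must be ordered left to right
    gap = right_max.column - left_max.column
    midpoint = (left_max.column + right_max.column) / 2
    if abs(center_min.column - midpoint) > tolerance * gap:
        return False  # criterion 1: maxima not equidistant from the minimum
    inner = max(slope(center_min, left_max), slope(center_min, right_max))
    outer = max(slope(left_min, left_max), slope(right_min, right_max))
    return outer <= slope_ratio * inner  # criterion 2
```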


Figure  4-4.  Eye  Signature  Slopes. 

Depending on whether or not an eye signature is
detected, EYEF updates element 385 of the STRIPDATA array to
indicate that possible eyes were found (element 385 = 1) or no
eyes were found (element 385 = 0). If eyes were found, then
the locations of the minima and maxima are also recorded in
STRIPDATA. The STRIPDATA array is then passed back to the
calling program (FACEFNDR) for evaluation.

If a possible eye signature is not located by EYEF,
then FACEFNDR proceeds to access the next 8 by 192 window of
the search area. This window is shifted 2 pixel rows up from
the first, and likewise, all subsequent windows are
incremented up, 2 pixel rows at a time. This process
continues until a possible eye signature is found or the
search area has been completely scanned.

When  a  possible  eye  signature  is  detected  by  the 
subroutine  EYEF,  FACEFNDR  verifies  that  the  location  of  the 
eye  signature  is  not  within  the  facial  boundaries  of  a 
previously  found  face.  If  it  is,  then  FACEFNDR  ignores  the 
indication  and  proceeds  with  scanning  the  search  area.  If 
the  eye  signature  lies  in  a  region  where  a  face  has  not  been 
previously  found,  FACEFNDR  initiates  a  request  to  the  Nova 
to  extract  a  64  by  64  image  subsection  of  the  region  of 
interest.  FACEFNDR  initiates  this  request  by  creating  the 
file CORDPTS64.B which contains the location data necessary
for  the  Nova  to  extract  the  proper  sub-image. 
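
The check against previously found faces can be sketched as a small bookkeeping structure. The names and the rectangular representation of a face region are assumptions for illustration. The four face limit is the one stated in chapter 3.

```python
from typing import List, NamedTuple

MAX_FACES = 4  # arbitrary limit chosen in the thesis


class Region(NamedTuple):
    """Bounds of a previously found face, in search area coordinates."""
    first_column: int
    last_column: int
    first_row: int
    last_row: int

    def contains(self, column: int, row: int) -> bool:
        return (self.first_column <= column <= self.last_column
                and self.first_row <= row <= self.last_row)


class FoundFaces:
    """Remembers face regions so that a face is not found twice."""

    def __init__(self) -> None:
        self.regions: List[Region] = []

    def already_found(self, column: int, row: int) -> bool:
        return any(r.contains(column, row) for r in self.regions)

    def is_full(self) -> bool:
        return len(self.regions) >= MAX_FACES

    def record(self, region: Region) -> None:
        if self.is_full():
            raise RuntimeError("face limit reached")
        self.regions.append(region)
```

An eye signature whose location satisfies `already_found` would be ignored and the scan would continue. Otherwise the request for a 64 by 64 image subsection would be issued.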

Upon detecting the presence of the file CORDPTS64.B,
the program GETSUBJ obtains the necessary information from
the file, extracts and saves the appropriate 64 by 64 image
sub-section to disk, and notifies the Eclipse that the
image sub-section is ready. The flag file used to indicate
that the image sub-section is ready for further processing
is the file "NOVASIG1.A".

When FACEFNDR detects the presence of NOVASIG1.A, it
begins the process of analyzing the data in the image
sub-section. The locations of the maxima and minima (from
the eye signature) are first converted to locations relative
to the 64 by 64 matrix. A nose/mouth signature is then
generated using the "center of face" window as discussed in
chapter 3 (see figure 3-5). In this case the individual row
gestalt transformations are performed by the subroutine
"RTRANSB". The resulting 64 "row gestalt" values are stored
in the 300 element integer array "GESTDATA", beginning at
element number 129. The data in GESTDATA are then analyzed
to determine the presence and location of the top of the
nose. If the nose is found, then the entire image subsection
is enhanced and re-analyzed. Enhancement is performed by use
of the modified Russel contrast expansion technique
(discussed previously). The limits which bound the box-like
sample region, upon which the enhancement is based, are
given by:


1. Top Limit = top of eye window + 1/2 the difference
between the location of the eye signature left
maximum and the center of eyes
(boundary "a" in figure 4-5).

2. Bottom Limit = top of nose location
(boundary "b" in figure 4-5).

3. Left Limit = eye signature left maximum location
(boundary "c" in figure 4-5).

4. Right Limit = eye signature right maximum location
(boundary "d" in figure 4-5).

The significant difference between this technique and
the initial enhancement originally used by Russel is that
the enhancement does not require that the image subsection
contain a standard size facial image which is located in a
standard position within the image sub-section. In the
"modified" Russel expansion, the location and bounds of the
box-like  region  are  determined  based  on  the  location  of  the 
face  and  the  facial  features.  The  only  other  difference  is 
that  the  contrast  multiplier  is  determined  based  on 
attaining  an  average  pixel  value  (for  the  box-like  region) 
of  14.5,  and  not  13.0.  Otherwise,  the  contrast  expansion  is 
essentially  the  same  as  the  original  Russel  expansion 
technique  and  the  reader  is  referred  to  his  thesis  (22:5-32) 
for  further  details. 

Re-analysis  of  the  enhanced  facial  image  starts  with  a 
determination  of  where  the  eye  signature  is  now  located. 
Starting  at  the  vertical  position  where  the  eye  signature 
originally  occurred  and  proceeding  upwards,  the  enhanced 
image  is  scanned  with  an  8  by  64  eye  window  to  determine  the 
new  location  of  the  eye  signature.  In  many  instances,  the 
eye  signature  location  changes  because  the  enhancement 
process  significantly  reduces  the  shadows  around  the  eyes. 
This  causes  the  eye  window  to  move  further  up  on  the  facial 
image,  before  the  eyes  are  detected.  Actual  detection  of  the 
eye signature is verified by the subroutine "FTHOR", which
is essentially a clone of the subroutine EYEF. The only
significant difference between the two subroutines is that
FTHOR works on 64 pixel wide windows and EYEF works on 192
pixel wide windows. Once the eyes are found again, the
process of determining the nose location is repeated.
When the nose is located, the original image is enhanced
again  (to  an  average  pixel  value  of  14.5  for  the  box-like 
sample  region),  only  this  time  the  bounds  of  the  box-like 
region  are  determined  from  the  feature  locations  as 
determined  from  the  enhanced  image.  This  second  iteration  is 
performed  to  guard  against  the  possibility  that  the  first 
sample  region  was  adversely  affected  by  unusual  facial 
shadows.  It  also  improves  the  accuracy  and  consistency  of 
the  facial  feature  locations.  Based  on  the  second 
enhancement,  the  facial  features  are  determined  a  final  time 
and stored in the GESTDATA array.

With  the  exception  of  the  subroutine  calls  indicated  in 
the  discussion  above,  the  analysis  of  the  64  by  64  image 
sub-section  is  performed  by  "in-line"  code  in  the  program 
FACEFNDR. If at any time during the process, the program
fails to find what it is searching for, or finds what it is
searching for, but in an unacceptable location, the analysis
of the image sub-section is terminated. If the analysis is
terminated due to a failure, an "eye found" indicator
(GESTDATA element number 257) is reset to "0", and the Nova
is notified via the flag/data file "COORDPTS1.B", which
contains the data from the GESTDATA array.

If, on the other hand, all facial features have been
located, then GESTDATA(257) contains a "1". As before, the
Nova is notified via the flag/data file COORDPTS1.B. Upon
detecting the presence of COORDPTS1.B, the top level program
on the Nova, GETSUBJ, reads in the file COORDPTS1.B, storing
the values in its GESTDATA array. When it is determined that
the value of GESTDATA(257) is non-zero, the operator is
informed of a successful location of a facial image. If the
facial image is small enough to contain the entire internal
facial region in a 64 by 64 image, then the image is saved
as TESTX.PI, where "X" is a number from 1 to 4, depending on
whether this is the first, second, third or fourth facial
image found. If the bounds of the internal facial region are
too large, then the operator is informed that a face was
found but the facial image is too large to be processed by
the system.

Along  with  the  image  file,  GETSUBJ  also  saves  the  data 
indicating  the  facial  feature  locations  and  the  final 
contrast  multiplier  as  determined  by  FACEFNDR.  GETSUBJ  then 
displays  (on  the  monitor)  a  set  of  grid  marks  which  outline 
the location of the internal face features. Provided that
this was not the fourth facial image to be found, GETSUBJ
resumes a cyclic check for further processing requests from
the Eclipse. Meanwhile FACEFNDR resumes its scan of the
image search area for other locations containing eye
signatures.

Generating  Facial  Gestalts 

There  are  two  normal  ways  in  which  the  scan  of  the 
search area (by FACEFNDR) may be terminated. The first is by
interrupting the process by striking the Octek keypad #7,
after a facial image has been found. This action causes
the program GETSUBJ to create the flag file "FACEDONE",
which results in immediate termination of the scan
procedure on the Eclipse. The second way the scan is
terminated is by normal completion of the scan of the search
area. In either case, the FACEFNDR program is reinitialized
and returns to its main loop of checking for interrupts from
the Nova.

On  the  Nova,  the  program  GETSUBJ  prompts  the  operator 
to  determine  whether  or  not  he  would  like  to  train  with  the 
facial images, identify the facial images or just quit. If
the  operator  chooses  to  quit,  then  GETSUBJ  is  reinitialized, 
and  the  system  is  ready  to  start  over  again.  If  the  operator 
chooses  one  of  the  other  two  options,  then  the  first  thing 
that  will  happen  (regardless  of  which  option  is  chosen)  is 
the  generation  of  facial  window  gestalts  for  the  first  face 
image  that  was  found. 

The program GETSUBJ initiates the generation of the
facial window gestalts by creating the data file "NOVASIG1"
and the flag file "CALCGEST". GETSUBJ then executes the
program GTGEST, on the Nova, by performing a program "swap".
The use of program swapping is fairly extensive from this
point on, so a brief word about this technique is in
order. When a program is swapped for another program, the
calling program is saved to disk along with most of the
parameters associated with it. The program being swapped in,
which replaces the calling program, is then loaded into core
and executed. When the swapped-in program completes execution,
or reaches a "CALL BACK" statement, the calling program
is reloaded into core and resumes execution at the next
executable statement after the swap statement. This swap
feature makes it possible to execute very lengthy routines
in a very limited core memory environment. Thus, when the
program GTGEST has completed, control returns to the top
level program on the Nova, GETSUBJ.

The flag file CALCGEST is a signal to FACEFNDR (on
the Eclipse) to set up for calculating gestalt values.
FACEFNDR accomplishes this by swapping in the program
"CORTRAN16". CORTRAN16 is the same program used by Russel
(22:5-30) for calculating the window gestalts. The data file
NOVASIG1 contains instructions as to which facial image file
(found and saved previously by the face locator process) is
to be processed. When GTGEST begins execution, it examines
the contents of NOVASIG1 to determine which facial image to
process. The images are always processed in increasing
order, from TEST1.PI up to TEST4.PI.

After  accessing  the  appropriate  facial  image  file, 
GTGEST  displays  the  unaltered  facial  image  on  the  monitor. 
The  image  is  then  enhanced  based  on  the  final  contrast 
multiplier  which  was  stored  with  the  image.  The  enhanced 
image  is  displayed  on  the  monitor,  next  to  the  original 
image,  as  is  the  enhanced  image  with  the  feature  locations 
marked,  as  illustrated  in  the  top  row  of  figure  3-10.  GTGEST 
then  performs  a  program  swap  with  the  program  PROCESS2. 
PROCESS2 generates the six facial windows, based on the
boundaries described in chapter 3 (see the section on window
generation). These 64 by 64 image files are saved to disk as
WIND1.PI through WIND6.PI and are also displayed on the
monitor (reference bottom two rows of figure 3-10). As each
window is saved to disk, a corresponding flag file
(NOVASIG1.A through NOVASIG6.A) is generated to initiate
processing by CORTRAN16. While CORTRAN16 is determining the
six window gestalts, PROCESS2 terminates, and control is
returned to the GTGEST program. GTGEST immediately performs
another program swap with the program SHOWGEST. The purpose
of SHOWGEST is to retrieve the window gestalt values
generated by CORTRAN16 and display these values on the
monitor (above the respective window).

Once  the  process  of  generating  the  window  gestalt 
values  is  completed,  control  again  returns  to  the  top  level 
programs (GETSUBJ on the Nova, FACEFNDR on the Eclipse).

Training the Facial Database

If the operator chose to train with the facial
images found by the face locator process, then he is now
prompted for a name for this facial image file. The operator
is also requested to assign an identification (ID) number to
the facial image. This ID number is the way in which the
system "knows" an individual. Thus, all facial images
assigned to a certain ID number are treated as the same
individual.

With the exception of the program GETNAME, the programs
used to store the gestalts and other pertinent information
in the database, WRNA2 and TRAIN, remain essentially
as described in Russel's thesis. Therefore the reader is
referred to Russel's thesis (22:5-31,0-6) for further
details concerning these programs. The only significant
modification made to TRAIN was to make it
perform in the directory NSMITH. The purpose of GETNAME is
to prompt the user for a filename for the facial image which
has just been processed (for which window gestalts were just
calculated). The facial image file is then saved to disk
with the name given by the operator. All three programs are
called into execution by program swap calls from GETSUBJ.

Once the facial image database has been updated, by a
single sequence of calls to GETNAME, WRNA2 and TRAIN, the
next facial image (if any) is processed. The program GETSUBJ
can determine how many pictures need to be processed, based
on the value of the variable "IPIXCT", which is incremented
during the face location process. If another face needs to
be processed, then the process starts over again at the point
where GETSUBJ initiates the "generation of facial gestalts"
(see previous section, 3rd paragraph).

Recognizing  Faces 

If, instead of choosing to train the database, the
operator chose to identify the facial images, then after the
facial window gestalts have been determined, as described
previously, the algorithm proceeds immediately to the
phase of trying to identify the individual.

On the Eclipse, having completed the window gestalt
calculations, the program CORTRAN16 terminates and returns
control to FACEFNDR. FACEFNDR, in turn, returns to checking
for interrupts from the Nova and is ready to begin a new
task. On the Nova, GETSUBJ proceeds to initiate the
recognition task by creating the flag file "IDCOM", which
directs FACEFNDR to begin this task. FACEFNDR activates the
program REMID, by program swap, which performs the
comparison of the window gestalt values for the unknown
facial image with those in the database.

The  program  REMID  is  a  modified  version  of  the  original 
program  REM  (used  in  Russel's  thesis).  The  program  has  been 
modified  to  perform  in  a  non-stop  fashion  (it  no  longer 
requires  operator  interaction).  It  has  also  been  modified 
(by  Captain  Russel)  to  perform  in  an  interactive  display 
mode,  which  allows  the  results  of  the  recognition  process  to 
be  displayed  on  the  TV  monitor  at  the  Nova.  An  added  feature 
of  this  interactive  mode  is  the  use  of  a  DECTALK  speech 
synthesizer  which  announces  various  milestones  the  program 
is  about  to  accomplish.  At  the  end  of  the  recognition 
process  the  DECTALK  verbally  greets  the  recognized 
individual. The display program which runs concurrently with
REMID (on the Nova) is the program NPROC1. Figure 3-12
illustrates the display generated by NPROC1.

The  basic  rules  of  the  selection  process  used  in 
recognizing  a  person  remain  unchanged  from  the  original 
version.  In  fact,  this  portion  of  the  code  (in  REMID) 
remains  unchanged  from  the  original  (the  program  REM).  The 
details of this selection process are fully explained in
Russel's thesis (22:4-32) and to a somewhat lesser degree in
Appendix A (starting at page A-7) and will not be repeated
here. One parameter has changed slightly from the original
version.  In  the  original  version,  the  measure  of  how  certain 
the  computer  was  (of  the  correct  choice  for  the  unknown 
person)  was  indicated  by  a  number  which  represented  the  sum 
(for  all  six  windows)  of  each  window  distance  metric 
(discussed  in  Appendix  A,  page  A-8)  times  the  window 
performance  factor.  This  parameter  has  been  translated  to  a 
"pseudo-probability"  measure  by  multiplying  the  original  sum 
by  100,  and  dividing  that  product  by  the  sum  of  the  six 
window  performance  factors. 
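
The rescaling described above can be illustrated with a minimal sketch in Python. The function and argument names are hypothetical, and the sample values used with it are invented for illustration; only the arithmetic (weighted sum, times 100, divided by the sum of the six window performance factors) is taken from the text.

```python
def pseudo_probability(window_metrics, performance_factors):
    """Rescale the weighted sum of the window metrics by the total weight.

    window_metrics      -- one distance-metric value per facial window
    performance_factors -- the window performance factor for each window
    """
    if len(window_metrics) != len(performance_factors):
        raise ValueError("one performance factor is needed per window")
    # Original certainty measure: sum of metric times performance factor.
    original_sum = sum(d * w for d, w in zip(window_metrics, performance_factors))
    # New measure: multiply by 100, divide by the sum of the factors.
    return 100.0 * original_sum / sum(performance_factors)
```

If each window metric lies between 0 and 1 (an assumption made here, not a statement from the text), the result lies between 0 and 100, which is presumably why the measure is called a "pseudo-probability".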

If  more  than  one  face  has  been  found  in  the  original 
image,  the  above  process  is  repeated,  starting  again  with 
the  generation  of  the  facial  window  gestalts  for  the  next 
unknown  image.  This  continues  until  all  faces  that  were 
found  have  been  processed  and  identified.  As  the  recognition 
process  for  each  unknown  subject  is  completed,  the  user  is 
given  an  opportunity  to  print  the  image  on  the  line  printer. 
Printing  is  performed  by  the  Eclipse  system,  so  it  is 
necessary  to  use  FACEFNDR  to  print  the  image.  GETSUBJ 
initiates  the  print  operation  by  performing  a  program  swap 
with  the  program  SVPIC.  SVPIC  then  saves  the  image  on  the 
monitor,  to  disk,  and  generates  a  flag  file  "PRNTIMAGE".  The 
flag  file  causes  FACEFNDR  to  print  the  image  on  the  Eclipse 
printer.

V. Test Results


As  in  the  previous  chapters,  this  chapter  is  divided 
into  two  main  sections.  The  first  section  examines  the  tests 
and  test  results  associated  with  the  face  locator  function 
of  the  AFRM  (Autonomous  Face  Recognition  Machine).  The 
recognition  performance  of  the  AFRM  is  then  examined  in  the 
second  section. 

Locating  Faces 

Two  types  of  images  were  used  to  test  the  face  locator 
portion  of  the  AFRM.  One  type  was  an  image  containing 
several  facial  images  pictured  against  a  pristine 
background.  This  type  of  image  is  illustrated  in  figures  5-1 
and 5-2. Images such as these were constructed for 30
different subjects. All of the facial images were taken from
the  on-line  facial  database  (for  examples  see  appendix  D). 
The  individual  facial  images  used  were  all  obtained  under 
the  following  conditions: 

1.  The  F-stop  used  in  all  cases  was  F8.0. 

2.  Camera  zoom  was  adjusted  to  exactly  fit  the 
vertical  dimensions  of  the  head  (bottom  of  chin  to 
top  of  head)  within  a  64  by  64  pixel  image. 

3.  White  background. 

4.  Normal  laboratory  lighting  (overhead  fluorescent). 

5.  Subject  position  was  fixed  relative  to  the  video 
camera  (9  feet  in  front  of  the  camera). 

6.  Slight  variations  in  facial  expression  were  taken 
for  each  image  of  a  subject. 


The multiple facial image pictures shown in figures 5-1
and  5-2  were  produced  by  overlaying  the  facial  images  from 
the  database  onto  a  white  background.  This  process  was 
performed  completely  by  the  computer,  using  the  program 
PICTURE2.  The  only  criterion  for  placement  of  the  facial 
images  was  that  the  eyes  must  appear  within  the  search  area 
(the  rectangular  box  indicated  in  figure  5-1).  Otherwise, 
the  placement  of  the  facial  image  was  arbitrary. 

Although the subjects were, for the most part, chosen at
random, an attempt was made to ensure that a broad variety
of facial types was included. All but two of the
individuals illustrated in appendix D were included in this
group. The two that were excluded are shown in figure D-3
(reference Appendix D), in the lower right and lower left
corners.  They  were  excluded  because  the  majority  of  the 
images  contained  in  the  data  bank,  for  these  subjects,  were 
facial  images  taken  while  wearing  dark  rimmed  glasses. 
Glasses  in  general,  and  especially  dark  rimmed  glasses, 
significantly  distort  the  eye  signature.  Thus,  one  of  the 
pre-conditions  assumed  in  this  thesis  is  that  the  subject  is 
not  wearing  eye  glasses. 

The  total  number  of  different  facial  images  tested  in 
this  manner  was  139  (about  4  to  5  different  pictures  per 
subject).  Of  that  total,  131  were  correctly  located.  Of  the 
131  correctly  located,  only  two  showed  an  obvious  error  in 
the  determination  of  the  location  of  a  single  facial 
feature. That single feature was the location of the mouth.
In these instances the mouth was erroneously determined as
being somewhere between the top of the nose and the actual
center of the mouth. The remaining feature locations were
correctly identified. This performance reflects a 94% hit
(success) rate for the locator algorithm (131 of 139).

The second type of image used to test the face locator
function was one in which the background was not pristine.
An example of this type of image was used in chapter 3
(reference figure 3-7). The purpose of this test was not so
much to determine the hit rate for finding faces, as it was
to determine how "face specific" the face locator is. If the
algorithm were to work in an ideal fashion, it would be
completely (100%) face specific. In other words, any object
(or collection of objects) that did not constitute a facial
image would, under ideal conditions, never be confirmed by
the algorithm as a face. Unfortunately, there is apparently
no exact metric by which to gauge the face specificity of
the algorithm. Thus, the results of this test reflect the
subjective judgement of the investigator.

Since  the  AFRM  system  was  designed  as  an  enhancement  to 
the  original  face  recognition  machine  (developed  by  Russel), 
with  the  goal  of  removing  as  much  reliance  on  the  operator 
as  possible,  it  was  decided  (as  a  first  cut)  to  test  the 
machine with background conditions which occur normally in
the AFIT Signal Processing Lab. Normal conditions means that
the operator acquires the picture of a subject by pointing
the camera at the subject(s), adjusts the zoom so that the
size of the facial image(s) is within an acceptable range,
and snaps the picture. The operator need not concern himself
about  the  background  or  the  exact  position  of  the  individual 
with  respect  to  the  camera.  The  assumptions  stated  in 
chapter  1  (Assumptions  section)  must  still  be  observed.  That 
is,  the  subject  is  looking  square  at  the  camera,  with 
minimal  rotation  or  tilt  of  the  face.  The  lighting  is  normal 
overhead  lighting,  and  the  subject  is  not  moving. 

Appendix  E  contains  several  examples  of  test  images 
taken  under  these  conditions.  These  images  are  presented 
first  as  the  unaltered  image,  followed  by  the  same  image  (in 
the  next  figure)  showing  the  results  of  the  AFRM  face 
locator  function.  In  each  case,  the  subjects,  the  number  of 
subjects,  the  size  of  the  facial  images  and  the  background 
were  varied  as  much  as  possible  (within  the  confines  of  the 
lab). In the images which show the results of the face
locator analysis, facial images are marked, as usual, by a
gridwork of 3 vertical and 4 horizontal dark lines. The
search  area  is  marked  by  the  dark  boundaries  of  the  large 
box  cursor. 

Only  two  instances  of  false  alarms  occurred  in  these 
images.  One  instance  occurred  in  figure  E-5,  where  the 
computer  misinterpreted  the  dark  round  control  knobs  located 
above  and  to  the  right  (subject's  right)  of  the  subject. 
Another  occurred  in  figure  E-13,  where  some  background  noise 
located  to  the  right  (subject's  right)  of  the  neck  of  the 
taller subject was interpreted as a face. Figures E-26
through E-28 contain no faces, and were designed as "acid"
tests to see if any false faces might be found. In no case
was the presence of a face confirmed in these images. The
results of these preliminary (and somewhat limited) noisy
background tests and the previously described test (94% hit
rate with a pristine background) indicate that the
algorithm used for locating faces has a reasonably high
specificity for faces.

Concerning the test images presented in appendix E, a few
images indicate that a face was not found. In test image #1
(figure E-1, Appendix E), the condition of adjusting the
zoom  to  get  roughly  the  proper  size  was  deliberately 
violated.  In  this  case,  an  analysis  of  the  search  area  by 
the  face  locator  algorithm  yielded  the  result  that  no  face 
was  found.  This  result  is  expected  since  the  facial  images 
were  too  large  to  be  contained  within  a  64  by  64  image 
sub-section.  The  reader  will  recall  that,  once  a  facial 
image  has  been  located,  it  is  saved  as  a  64  by  64  image 
sub-section.  If  the  facial  image  is  too  large,  then 
information  necessary  for  the  face  recognition  process  would 
be  lost.  Thus  an  upper  limit  is  placed  on  the  size  of  the 
facial image. This limit is governed by the distance (in
pixel columns) between the locations of the two maxima in
the eye signature. If that distance is greater than 21
pixels, then the signature is not recognized as an eye
signature. Figure 5-3 illustrates an example of the 64 by 64
image obtained when the program is operating very close to
the maximum facial size. The sub-image illustrated in figure
5-3 was obtained from test image #8 (figure E-14). The same
explanation is true (the facial image was too large) for the
missed face in figure E-13.

Figure 5-3. Example of Maximum Face Size.

Figure  E-19  illustrates  the  case  where  the  facial  sizes 
are  just  at  the  lower  limit.  Only  one  subject  face  was 
confirmed  in  this  image.  Because  the  distance  between  the 
two  maxima  (from  the  eye  signature)  was  less  than  10  pixels 
for  the  other  two  subjects  (on  the  left),  these  faces  were 
not  located. 
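
The two size limits described above amount to a simple range test on the spacing of the eye-signature maxima. The following Python sketch is illustrative only: the function and constant names are hypothetical, and the thesis does not give the original code. The limits of 10 and 21 pixels are taken from the text, and the test is assumed to be inclusive at both ends.

```python
MIN_EYE_SPACING = 10   # pixels; smaller spacings were not located (figure E-19)
MAX_EYE_SPACING = 21   # pixels; larger spacings are rejected (figure 5-3)


def eye_spacing_acceptable(left_max_col, right_max_col):
    """Check the column distance between the two eye-signature maxima."""
    spacing = abs(right_max_col - left_max_col)
    return MIN_EYE_SPACING <= spacing <= MAX_EYE_SPACING
```

Under these assumptions, maxima at columns 20 and 41 (spacing 21) would be accepted, while maxima at columns 20 and 42 (spacing 22) would be rejected as too large.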

One further test was done to observe the effect of
using only an edge enhanced facial image. Figure 5-4
illustrates such an image. This facial image was obtained by
photographing the image directly from a book (9:167). In
photographing the image, it was necessary to use a very
small aperture (F16.0) to obtain reasonable resolution. Use
of the video/digitizer system under these conditions
generates considerable noise in the image. This is
illustrated  in  figure  5-4.  Both  images  in  figure  5-4  are 
pictures  of  the  same  original  image  taken  at  different 
times.  In  one  case  (for  the  left  image),  the  face  locator 
was  successful  in  finding  the  face.  However,  for  the  image 
on  the  right,  the  face  locator  was  unable  to  find  the  face. 
The  reason  for  this  failure  was  investigated  by  obtaining 
and  analyzing  the  eye  signature  for  the  facial  image  on  the 
right.  The  eye  signature  is  illustrated  in  figure  5-5  (upper 
right  corner).  Examination  of  the  eye  signature  readily 
indicates  why  the  algorithm  failed  in  this  case.  There  is  a 
small  peak  at  the  very  center  of  the  signature  due  to  the 
noise  in  the  image.  The  smoothing  technique  used  prior  to 
analysis  of  the  eye  signature  is  not  sufficient  to  remove 
this  small  peak.  Thus,  when  searching  for  two  successive 
peaks,  the  algorithm  will  erroneously  identify  this  small 
central  peak  as  the  second  peak.  This  results  in  a  failure 
to  meet  the  eye  signature  criteria  and  the  search  fails. 

This  type  of  problem  did  not  occur  with  images  obtained 
with  larger  apertures  (F8.0).  The  problem  appears  to  be  due 
to  discretization  error  within  the  system  which  is  much  more 
marked  when  operating  at  small  apertures.  Although  the 
average  system  in  use  would  not  demonstrate  such  noisy 
behavior,  the  answer  to  this  problem  is  to  filter  out  as 
much  of  the  noise  as  possible  as  part  of  a  pre-processing 
step. Alternatively, when searching for the eye pattern in
the original image, a darker "background fill intensity" (as
discussed in chapter 3, page 3-11) might be used to suppress
such noise.

Recognizing Faces

Using  a  database  of  twenty  individuals,  the  system  was 
tested  for  recognition  performance.  The  system  was  trained 
with  an  average  of  four  facial  images  per  subject.  A  fifth 
image  was  held  in  reserve  (the  machine  was  not  trained  with 
this  image)  for  testing.  All  images  were  acquired  using  the 
face  locator  algorithm,  and  windowed  in  the  manner  described 
in  chapter  3.  Two  trials  were  conducted  using  the  same 
database  of  20  people.  On  the  second  trial,  those  images 
that  had  been  used  (on  the  first  trial)  as  test  images,  were 
used  as  training  images.  An  image  (for  each  subject)  that 
had  previously  been  used  for  training,  was  then  used  to  test 
the system. For example, if the system had been trained on
pictures Smith1.PI through Smith4.PI, and tested with
picture Smith5.PI in trial #1, then in trial #2 it would be
trained on Smith2.PI through Smith5.PI and tested on picture
Smith1.PI. The recognition performance results that were
obtained  are  shown  in  Table  5-1. 

Also  shown  in  Table  5-1  (for  comparison),  are  Russel's 
original results. The metric "average reduction in
uncertainty" or "F" was determined as follows:

\[
F = 1 - \frac{1}{M} \sum_{i=1}^{M} \frac{S_i - 1}{N} \qquad (5\text{-}1)
\]

where S_i = number of individuals the correct
            person is down from the top of an
            ordered list of candidates.

      i   = number of the particular individual in
            the database who is being processed for
            recognition.

      N   = total number of individuals in the
            database.

      M   = number of individuals for which the
            recognition system was tested.

Equation  5-1  is  identical  to  that  used  by  Russel 
(22:6-8).  The  average  reduction  in  uncertainty  is  a  more 
accurate  measure  (than  the  percent  absolute  correct)  of  the 
machine's  performance  since  it  takes  into  account  how  close 
(in  rank  order)  the  correct  individual  was  to  the  top 
candidate. 
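
As a check on equation 5-1, the following Python sketch recomputes F from the rank counts reported in Table 5-1. It takes S_i = 1 when the correct person is the top candidate, which is the reading under which the tabulated values are reproduced. The function names are hypothetical and the code is not part of the original system.

```python
def average_reduction_in_uncertainty(ranks, database_size):
    """Equation 5-1: F = 1 - (1/M) * sum((S_i - 1) / N).

    ranks         -- S_i for each tested individual (1 means top candidate)
    database_size -- N, the total number of individuals in the database
    """
    m = len(ranks)  # M, the number of individuals tested
    return 1.0 - sum((s - 1) / database_size for s in ranks) / m


def ranks_from_counts(counts):
    """Expand {rank: number of subjects} into a flat list of ranks."""
    return [rank for rank, n in counts.items() for _ in range(n)]


# Rank counts as listed in Table 5-1 (N = M = 20 in every case).
russel = ranks_from_counts({1: 18, 2: 1, 3: 1})
trial_1 = ranks_from_counts({1: 12, 2: 4, 3: 2, 5: 1, 8: 1})
trial_2 = ranks_from_counts({1: 10, 2: 5, 3: 3, 5: 1, 11: 1})

for name, ranks in [("Russel", russel), ("Trial 1", trial_1), ("Trial 2", trial_2)]:
    print(name, round(average_reduction_in_uncertainty(ranks, 20), 4))
```

Rounded to four places, the three results agree with the tabulated values of 0.9925, 0.9525 and 0.9375.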

Although  the  recognition  performance  was  relatively 
poor  when  gauged  in  terms  of  percent  absolute  correct,  the 
figures  obtained  for  the  average  reduction  in  uncertainty 
were  respectably  high.  In  both  tests,  the  correct  individual 
was  in  the  top  3  candidates,  in  18  out  of  20  cases.  These 
results  look  very  promising,  especially  considering  that  the 
choice  of  facial  gestalt  windows  was  done  rather  hastily  and 
was not based on a thorough analysis of all possible
windows.


Russel's Results:

   Number in database:                   20
   Number recognized as 1st Choice:      18
   Number recognized as 2nd Choice:       1
   Number recognized as 3rd Choice:       1
   Absolute Correctness = 0.90
   Average Reduction in Uncertainty = 0.9925

AFRM Results:

Trial #1:

   Number in database:                   20
   Number recognized as 1st Choice:      12
   Number recognized as 2nd Choice:       4
   Number recognized as 3rd Choice:       2
   Number recognized as 5th Choice:       1
   Number recognized as 8th Choice:       1
   Absolute Correctness = 0.60
   Average Reduction in Uncertainty = 0.9525

Trial #2:

   Number in database:                   20
   Number recognized as 1st Choice:      10
   Number recognized as 2nd Choice:       5
   Number recognized as 3rd Choice:       3
   Number recognized as 5th Choice:       1
   Number recognized as 11th Choice:      1
   Absolute Correctness = 0.50
   Average Reduction in Uncertainty = 0.9375

Table 5-1. Test Results for Recognition


VI. Conclusions and Recommendations

Conclusions

In  an  attempt  to  combine  the  face  recognition 
capabilities  of  Russel's  Face  Recognition  Machine,  with  the 
additional  capability  of  automated  scene  analysis  for  faces 
in  a  digital  image,  the  Autonomous  Face  Recognition  Machine 
(AFRM) has been developed. The question that needed to be
answered  was:  "Can  a  machine,  entirely  on  its  own,  determine 
whether  or  not  a  person's  face  is  in  a  picture,  and  if  so, 
can  it  determine  to  whom  the  face  belongs?".  The  results  of 
this  thesis  demonstrate  very  clearly  that  the  answer  is  yes, 
on  both  counts.  The  results  also  indicate  that  much  more 
development  needs  to  be  accomplished  before  such  a  machine 
becomes  practical. 

In terms of its ability to analyze a digitized scene
for the presence of faces, the AFRM demonstrated a high
success rate of 94%. For the remaining 6% (representing 8 of
the 139 facial images tested), the machine initially zeroed
in on the face in the large majority of cases. It did not
confirm the presence of these faces for various reasons. One
such reason is that the contrast expansion of the original
image was too great due to dark hair at or below the eyebrow
level. Another reason is that the locations of various
features were outside the expected range by one or two
pixels. These problems should be relatively easy to overcome
by relaxing and adjusting the constraints of the face
locator. Although difficult to quantify, the AFRM's ability
to distinguish between faces and other classes of objects
(its face specificity) appears to be quite good. False
confirmation of a facial image occurred in only two
instances out of 17 test images.

While the face locator is relatively immune to
variations in translation and scale of the facial image, it
is susceptible to rotation of the facial image. Small
degrees of tilting or turning of the head have no ill
effect, but larger degrees of rotation (more than 5 to 10
degrees) will cause the face not to be found.

The face locator is also relatively immune to
electronic  equipment  noise.  The  case  demonstrated  in  figures 
5-4  and  5-5  is  an  extreme  case  where  there  is  a  very  high 
degree  of  discretization  noise  at  intensity  levels  low 
enough  to  affect  the  eye  signature.  Normally,  it  is  not 
envisioned  that  the  machine  would  be  required  to  work  with 
such  a  poor  image.  The  algorithm  could  be  adjusted  to 
compensate  for  such  a  problem. 

The  AFRM's  ability  to  recognize  an  individual  was 
significantly  reduced  when  compared  with  Russel's  original 
results  for  absolute  correctness.  This  is  not  surprising 
when  one  considers  that  the  recognition  scheme  used  in  the 
AFRM  is  based  solely  on  the  internal  features  of  the  face. 
Unlike  the  original  (Russel's)  machine,  the  AFRM  does  not 
have available to it the identity specific information
contained in the outer bounds of the face and the hair style.
However, when compared to Russel's results for the average
reduction in uncertainty, the machine compared quite
favorably. It is highly probable that, through a more
judicious choice of windows, the recognition performance can
be brought up to the level achieved by Russel.

Recommendations

Considering  the  relatively  crude  method  used  to  analyze 
the  gestalt  pattern  for  the  eye  signature  (point  by  point 
analysis  of  the  curve),  and  the  very  limited  criteria  for 
the  eye  signature,  the  high  performance  rate  of  the  face 
locator function is very encouraging. A significant
improvement  to  the  algorithm  would  be  to  first  do  a  least 
squares  fit  of  the  gestalt  pattern  data  to  a  set  of 
polynomial  coefficients.  Alternatively,  one  might  also  fit 
the  data  to  a  set  of  Fourier  coefficients.  In  either  case, 
this  type  of  transformation  would  facilitate  a  more  reliable 
and  consistent  analysis  for  the  eye  signature.  In  addition 
to alleviating such noise problems as that indicated in
figure 5-5, an extra bonus might be obtained. The bonus
would be a further means of identifying the face. Once the
Fourier or polynomial coefficients have been obtained for
the eye signature curve, it may be possible to identify an
individual (at least in part) by comparing these
coefficients  with  those  of  previously  obtained  eye 
signatures.  In  addition  to  the  eye  signature,  the 
application  of  this  technique  to  the  nose/mouth  signature 
should  also  be  examined. 
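
The Fourier alternative recommended above might be sketched as follows. This is a minimal illustration in Python under stated assumptions, not an implementation from the thesis: the function names, the number of harmonics kept, and the sample signature are all hypothetical. The sketch computes the low-order Fourier coefficients of a signature and rebuilds a smoothed curve from them, and it assumes the number of harmonics is less than half the signature length.

```python
import cmath
import math


def fourier_coefficients(signature, num_harmonics):
    """Return the DFT coefficients c_0 .. c_K of a real-valued signature."""
    n = len(signature)
    coeffs = []
    for k in range(num_harmonics + 1):
        total = sum(value * cmath.exp(-2j * math.pi * k * t / n)
                    for t, value in enumerate(signature))
        coeffs.append(total / n)
    return coeffs


def reconstruct(coeffs, length):
    """Rebuild a smoothed real signature from c_0 .. c_K (K < length / 2)."""
    smoothed = []
    for t in range(length):
        value = coeffs[0].real
        for k in range(1, len(coeffs)):
            # Each harmonic k > 0 stands for the pair (k, -k) of a real signal.
            term = coeffs[k] * cmath.exp(2j * math.pi * k * t / length)
            value += 2.0 * term.real
        smoothed.append(value)
    return smoothed


# Hypothetical eye signature: two broad peaks plus a one-sample central spike.
signature = [math.exp(-((t - 8) / 3.0) ** 2) + math.exp(-((t - 24) / 3.0) ** 2)
             for t in range(32)]
signature[16] += 0.1
smoothed = reconstruct(fourier_coefficients(signature, 4), len(signature))
```

Because only a few low-order harmonics are kept, a narrow spike such as the one at the center of this sample signature is strongly attenuated in the rebuilt curve. The same coefficients could also serve as the comparison features suggested in the text, although that use is not demonstrated here.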

ABSTRACT: A face recognition system was
developed at the Air Force Institute of
Technology, based on the principles of
Cortical Thought Theory (CTT), proposed
by Dr. Richard Routh in July 1985 as his
doctoral dissertation. CTT is an initial
attempt at a unified brain theory. The
CTT "gestalt" transformation maps a
2-dimensional image into a 2-D coordinate
point. The face recognition system
extracts six sub-images from a
contrast-expanded image, calculates the 2-D
gestalt coordinates, and stores the
information in a database. Statistics
are then calculated on at least five
prototypes processed for each person. An
"unidentified" person is recognized by
calculating the six gestalt feature
vectors, and finding the closest match to
previously stored data. Performance
testing of the system yielded a
reliability of 90% for a database of 20 people.


I.  Introduction. 

A face recognition system was developed, based on the principles of
Cortical Thought Theory (CTT), recently published by Dr. Richard L.
Routh in July 1985 as his doctoral dissertation at the Air Force
Institute of Technology (4).

CTT claims to be a generic model for sensory information analysis,
regardless of the domain or entry level of abstraction. Routh tested
the CTT architecture successfully for speech processing. In order to
test this architecture as a generic model, CTT was tested for visual
processing, specifically for the difficult task of human face
recognition.


Since the purpose of this research was to apply Cortical Thought Theory
to the domain of vision, it is instructive to review the major concepts
of this theory. For years, those involved in Artificial Intelligence
(AI) have tried to model human thinking by using logic and other
deductive processes. They have enjoyed considerable success in many
areas, but computer systems still have great difficulty reproducing
what we call "insight", or taking two pieces of information and
inducing a new association. Another problem with conventional AI
systems is that the search time increases exponentially with the size
of the knowledge base.

Rather than starting with basic operations (primitives) using deduction
(a concept well established in AI), Routh approached the problem by
starting with primitives of induction. His theory proposes that
information is displayed as a two-dimensional image on the cortex
surface of the brain. The cortex must then extract a two-dimensional
vector from the image, which he referred to as the "gestalt" of the
image. He maintained that the dimension of the gestalt feature vector
set must be "two". This type of representation allows direct memory
access, which means basically no increase in search time, even with any
increase in size of the knowledge base. This 2-D vector is all that is
passed up to the next level of abstraction. He explains this as
follows:

By using the experimental results obtained from the perceptual
psychology investigations into the nature of the human gestalt
mechanism by Kabrisky, Maher, Ginsburg, Pantle, and Sekuler (among
others), it was argued that the two element gestalt vector is probably
extracted from some low pass two-dimensional spatial frequency domain
representation of the 2-D input image.

But what spatial frequency domain representation was to be used?
Several methods of displaying the low-frequency spatial harmonics of a
2D-DFT were investigated so as to find a single identifying 2-space
vector characteristic which could be called a "gestalt".

The method had to suppress the D.C. value which did not contain useful
information for identification.

It also had to deal with how to present both sine and cosine components
of a 2D-DFT on a 2-dimensional surface. It was observed that if the
Two-Dimensional Discrete Fourier Sine Transform (2D-DFST) was used
(instead of the 2D-DFT), and if the technique of zero-filling was used
to produce sub-integral harmonics, a "hump" was usually observed
between the zeroeth and the first harmonic. The location of the peak of
this hump could easily represent the gestalt value since it can be
represented by a two-space vector, and it changes location for
different input images (see figure 1). Experiments suggested that it
was sufficient to examine the 1/64th harmonics between zero and one.
The 2D-DFST gestalt mechanism is specified by the following equations:

where GFVS = (i,j) is the two-space vector identifying the location of
the gestalt on the next higher (in the hierarchy of abstraction) local
cortex surface.

It was shown that the level one neurons of the cortex could easily
perform a very good approximation to the 2D-DFST from the zeroeth to
the first harmonic. There would be an error between the true 2D-DFST
and the cortex transform, but the cortex transform still preserves the
important characteristic: it produces a 'hump' whose peak moves in
relation to the human-perceived difference in the input images. The
gestalt would be the two-space location of the cortical column located
at the highest amplitude point (5).


The transform used to simulate this process is as follows:

Given Mkh = the discrete input image,

where σ = 0.435.

(In this system, 0 <= Mkh <= 15. "0" signifies a "white" pixel, and
"15" signifies a "black" pixel, with varying grey scale values in
between.)

The above "cortex" model of the transform is the basis of the feature
vector used in the CTT Face Recognition System. Routh's theory also
embraces the work of Dr. Leslie Goldschlager from the University of
Sydney in Australia (1). Goldschlager, studying brain theory on an
independent course from Routh, explains how a local cortex surface
could reasonably perform the operations of set completion and sequence
completion (1,4). Set completion is an operation in which all members
of a set are retrieved, given a unique subset. This characteristic may
explain such phenomena as recalling many things about a person
seemingly simultaneously, given only the person's name. Sequence
completion embodies the AI concept of scripts, in which points are
stored in the order in which they occur. Given a unique subset of these
points in the right order, sequence completion will retrieve the rest
of the points in the sequence.

Combining the retrieval characteristics of set completion and sequence
completion with Routh's gestalt mechanism, Routh proposed a model of a
complete human reasoning system (4,5).

III.  CTT Vision Model.

The first step in designing a vision system based on CTT is to examine
the general requirements which CTT outlines for a human-like
information processing system (see figure 2):

1) Display the information as a 2-dimensional image.

2) Define the proper boundaries, or "windows", on the image.

3) Extract different sub-looks, or "sub-windows", from the image.

4) Calculate the gestalt of the different sub-looks.

5) Display the gestalts from all the windows as points in a new image.

6) Apply "set completion" to find the set of previously-seen points to
which this new set maps most closely.

7) Find the gestalt of this new image. This will be displayed as a
single point on a 3rd level of abstraction, and is the "name" of the
original image.

This results in a surface displaying the names of all the images the
system has seen. To "recognize" an image, the system would calculate
its 2-dimensional gestalt coordinates. The "name" of the image is then
whichever previously stored point on the "name surface" is closest to
the coordinates of the unidentified image.
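
The final lookup on the "name surface" amounts to a nearest-point search. A minimal sketch follows; the dictionary layout, the function name, and the sample coordinates are illustrative assumptions, not part of the CTT system.

```python
import math


def nearest_name(name_surface, point):
    """Return the stored name whose 2-D point is closest to `point`.

    name_surface maps a name to its (x, y) gestalt coordinates.
    """
    return min(name_surface,
               key=lambda name: math.dist(name_surface[name], point))


# Made-up coordinates, for illustration only.
name_surface = {"face_a": (10.0, 12.0), "face_b": (30.0, 5.0)}
```

A linear scan is used here for clarity; the direct memory access that CTT proposes would instead index the surface by coordinate.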

Routh constructed a limited speech recognition system by calculating
the gestalt coordinates of time slices of speech signals which were
displayed in the log-amplitude by log-frequency format in which audio
information is presented to the brain's primary audio cortex. As an
initial test, the 64x64 primary audio cortex map was removed from
Routh's speech system, and in its place was inserted a 64x64, sixteen
gray level, digitized image of a human face. This analysis was applied
to five images each of sixteen different people. The results indicated
that human faces can be classified and distinguished with the CTT
model, and that the 2-D CTT mapping (or "gestalt") of the faces is
psychologically similar to the way a human would group them.


During the course of the research, it became evident that to get a
better separation between individuals, the system needed to look at
several sub-parts of the face. The whole face gestalt provided useful,
but not sufficient, information. The following additional processing
steps were added:

A. Contrast Enhancement. When initially taking pictures and processing
gestalts, the effect of lighting and f-stop was evaluated. It was found
that the best separation between pictures of different people came from
a high-contrast image in which facial lines are bleached out (for the
most part) and hair, eyes, nose and mouth appear as dark blobs. The
person is usually still recognizable in this form. Pictures were taken
at an f-stop of F8 (where all head boundaries were still visible to the
human operator and computer). The computer extracted boundary
information from this picture, and then artificially expanded the
contrast. (See figure 3, top center picture.)

B. Feature Location. Using this contrast-expanded image, the system
estimates locations of the major features on the face, and displays
them on the screen (see figure 3, top right picture). The user can at
this point readjust the feature locations if the computer chose them
incorrectly. The computer will then redisplay the changed values.

The half-face representation was used for a special reason. It was
found that the gestalt transform tends to find the center of mass of an
image. Given this, the transform is not sensitive to aspect ratio when
the head is centered in the transform window. Since a face is basically
symmetrical about its vertical center line, a wide face will give the
same gestalt in the X direction as a thin face. Unfortunately, people
tend to be quite aware of aspect ratio when recognizing someone (as
determined by an informal survey by one of the authors).

To handle this problem, it was necessary to divide the image down the
center, display the halves as two separate images, and take the
gestalts of the separate images (see figure 3). Now changes in aspect
ratio are reflected as changes in the X direction of the gestalt.
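
The effect of splitting the image can be illustrated with a plain intensity-weighted centroid standing in for the gestalt transform. This substitution is an assumption made only for illustration (the text says the transform "tends to find" the center of mass); the helper names and the toy 16-pixel-wide rows are hypothetical.

```python
def centroid_x(image):
    """Intensity-weighted mean column index (x center of mass) of a grid."""
    total = sum(sum(row) for row in image)
    weighted = sum(x * v for row in image for x, v in enumerate(row))
    return weighted / total


def left_half(image):
    """Left half of each row, split down the vertical center line."""
    return [row[: len(row) // 2] for row in image]


def face_row(half_width, window=16):
    """One row with a black band (value 15) centered in the window."""
    mid = window // 2
    return [15 if mid - half_width <= x < mid + half_width else 0
            for x in range(window)]


thin = [face_row(2)] * 4   # narrow centered "face"
wide = [face_row(5)] * 4   # wide centered "face"
```

For the whole window both toy faces share the same x centroid, while the left halves give different values, so the half-image centroid carries the aspect-ratio information that the full-image centroid loses.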

Given the desire to remain consistent with CTT and the physiology, this
split-image requirement at first seemed a strange restriction on the
presentation of a facial image. Then it was realized that the primate
visual system also splits images vertically down the center before
displaying them on separate left and right primary visual cortexes (2).
The reasons for the partial splitting (or "decussation") of the visual
pathway at the optic chiasm are not well understood, and attempted
explanations for the phenomenon quickly become complex and convoluted.
It is significant that Cortical Thought Theory provides a possible
explanation which is simple and straightforward, and which follows as a
natural requirement of the theory.

C. Window Extraction. Six different sub-windows on the face were
extracted. (See figure 3, bottom six pictures.)

D. Gestalt Calculation. For each of the six windows, a two-dimensional
gestalt coordinate is calculated, transformed for scale, and displayed
above the sub-image for which it was calculated (see figure 3).

E. Storage in Database. Once the gestalts are calculated for all six
windows on the face, all the data for this picture is put together as a
record in the Processed Picture Database. This process is repeated for
each picture.

When all the pictures are entered for an individual, the system is
ready to be "trained" with the data.

F. Training the Database. Training is done by characterizing an
individual by the X & Y mean and standard deviations of gestalt values
over a number of pictures. In this way the system has an idea of a
reasonable range of values to expect for a given individual. For this
study, five pictures were taken of each person for training. The
authors realized that scores of pictures taken over a period of time
(say, a year) would be desirable to thoroughly test the system.
However, time constraints prevented this. It was assumed the five
pictures would get us "in the ballpark,"

and X & Y standard deviations for a person stored at the 2-D coordinate
location indicated by the person's average X & Y gestalt values. Once
the coordinate database has been trained, it is ready to "recognize" an
individual.
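
The training statistics for one window can be sketched as below. The function name and return layout are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the paper does not say which was used.

```python
from statistics import mean, stdev


def train_person(gestalts):
    """Summarize one person's (x, y) gestalts for a single window.

    Returns ((mean_x, mean_y), (std_x, std_y)).
    """
    xs = [g[0] for g in gestalts]
    ys = [g[1] for g in gestalts]
    return (mean(xs), mean(ys)), (stdev(xs), stdev(ys))
```

With only five training pictures per person, the standard deviation estimates are coarse, which matches the authors' remark that the values are only expected to be "in the ballpark."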
standard deviations) from the coordinate values of the unidentified
person. Those within the range are selected as candidates for further
processing, while those outside this range are rejected. This is in
terms of the different standard deviation values associated with each
individual. For instance, assume (for just one dimension) the mean
coordinate values for Mike Jones and Joe Smith are the same distance
away (let's say 2.0) from the values for the unknown person. However,
Mike's standard deviation value is 0.5, and Joe's standard deviation is
2.0. Therefore, Mike Jones is 4.0 standard deviations from the unknown
person, and will not be selected as a candidate. However, Joe Smith is
1.0 standard deviations away, and is selected as a candidate.
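
The Mike Jones and Joe Smith example can be worked through in code. The cutoff `max_sigmas=3.0` is a hypothetical value chosen only so that the example runs (the range used by the system is not stated in the surviving text), and the stored means are made up so that both lie 2.0 away from the unknown value.

```python
def sigmas_away(unknown, stored_mean, stored_std):
    """Distance from a stored mean, in units of that person's std deviation."""
    return abs(unknown - stored_mean) / stored_std


def select_candidates(unknown, people, max_sigmas):
    """Keep the people whose stored mean lies within max_sigmas of `unknown`."""
    return [name for name, (m, s) in people.items()
            if sigmas_away(unknown, m, s) <= max_sigmas]


# One dimension only; each entry is (mean, standard deviation).
people = {"Mike Jones": (12.0, 0.5), "Joe Smith": (8.0, 2.0)}
candidates = select_candidates(10.0, people, max_sigmas=3.0)
```

Both stored means are 2.0 from the unknown value of 10.0, yet only Joe Smith survives the screen because his spread is four times larger.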

2) Distance Measure. A distance is calculated from the unknown person's
values to stored values for each of the six individual windows. The
distance measure for each individual window is:


$$ V_{iw} = \sqrt{ \left( \frac{G_{ix} - G_{ux}}{\sigma_{ix}} \right)^2 + \left( \frac{G_{iy} - G_{uy}}{\sigma_{iy}} \right)^2 } $$

where i = number of individual in database being considered,

w = number of window being processed,

G_ix, G_iy = X,Y coordinate values of previously stored candidate,

G_ux, G_uy = X,Y coordinate values for an unidentified person, and

σ_ix, σ_iy = X,Y standard deviations for person "i".
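
A per-window distance built from the quantities defined above might look as follows. The printed formula did not survive reproduction, so the form used here, a Euclidean distance with each axis scaled by the stored person's standard deviation, is an assumption suggested by the variable list and the preceding standard-deviation screening; the function name is hypothetical.

```python
import math


def window_distance(stored_xy, unknown_xy, std_xy):
    """Distance between stored and unknown gestalts for one window.

    Each axis difference is divided by the stored person's standard
    deviation on that axis before the two are combined (assumed form).
    """
    dx = (stored_xy[0] - unknown_xy[0]) / std_xy[0]
    dy = (stored_xy[1] - unknown_xy[1]) / std_xy[1]
    return math.hypot(dx, dy)
```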


3) Weighting by Window Performance. As mentioned above, each window, or
sub-look, has its own database, and a distance measure is calculated
for each window. However, when combining values from all the windows,
should all windows hold equal weight? Elaine Rich, in her book
Artificial Intelligence, points out that the weighting function should
take into account the "confidence in the evidence" (3). In this
application, the "confidence" is how well the particular window
discriminates between individuals, and is referred to in this work as
"performance factors." A performance factor is calculated as follows:


(Average standard deviation of the mean of (x,y) gestalt values) /
(Average of the standard deviations for all (x,y) gestalts). (6)

Combining the x and y weightings, we have

$$ P_w = \sqrt{P_{wx}^2 + P_{wy}^2} $$

where P_w is the performance factor which indicates the ability of the
particular window to discriminate between individuals.
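
One reading of the performance factor is the spread of the per-person means divided by the average per-person spread, computed per axis and then combined as a root sum of squares. The sketch below follows that reading; the interpretation, the function names, and the use of the sample standard deviation are all assumptions.

```python
import math
from statistics import mean, stdev


def axis_factor(values_by_person):
    """Spread of per-person means divided by the mean per-person spread.

    values_by_person holds one list of gestalt values (one axis, one
    window) for each person in the database.
    """
    person_means = [mean(v) for v in values_by_person]
    person_stds = [stdev(v) for v in values_by_person]
    return stdev(person_means) / mean(person_stds)


def performance_factor(x_by_person, y_by_person):
    """Combine the x and y factors as the root of the sum of squares."""
    return math.hypot(axis_factor(x_by_person), axis_factor(y_by_person))
```

A window scores highly under this reading when different people land far apart while repeated pictures of the same person land close together.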


4) Final Recognition List. By repeating this process on all six
windows, summing the values for each window, and sorting them, the
result is a list ordered from the most-likely candidates to the
least-likely.

$$ T_i = \sum_{w=1}^{6} (P_w \cdot V_{iw}) $$

where T_i = list of total values for individuals for all windows,

P_w = performance factor weighting for window w, and

V_iw = value of individual i for window w.
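
The weighted sum and the final sort can be sketched as follows. The data layout and function name are hypothetical, and treating the smallest total as the most likely candidate is an assumption that holds only if the per-window value is a distance.

```python
def rank_candidates(values, factors):
    """Return (names ordered by total, totals) for the weighted sum.

    values:  {name: [per-window value, one entry per window]}
    factors: [performance factor for each window, same order]
    Smallest total first (assumes the per-window value is a distance).
    """
    totals = {name: sum(p * v for p, v in zip(factors, per_window))
              for name, per_window in values.items()}
    return sorted(totals, key=totals.get), totals


# Two windows only, with made-up numbers, to keep the example short.
order, totals = rank_candidates({"a": [1.0, 2.0], "b": [0.5, 0.5]},
                                [2.0, 1.0])
```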


V.  TESTING

The system was trained with from 4 to 9 pictures each of 20
individuals. One image for each individual was used to test the system.
(This picture was not included in the training set.)

VI.  RESULTS.

A. Recognition Performance.

1) The overall recognition results obtained are shown in table 1:

Number in database:               20
Number recognized as 1st choice:  18
Number recognized as 2nd choice:   1
Number recognized as 3rd choice:   1

Absolute Correctness = 0.90
Average Rank Order   = 0.9925 (with 1.00 being an average of #1 out of 20.)

Table 1. Test Results for Recognition
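
The two summary figures in table 1 can be reproduced from the rank counts. The paper does not give the Average Rank Order formula; the scoring used below, 1 - (rank - 1)/N averaged over the test images, is an inferred assumption that happens to reproduce the reported 0.9925, and the function names are hypothetical.

```python
def absolute_correctness(ranks):
    """Fraction of test images whose correct person was ranked first."""
    return sum(1 for r in ranks if r == 1) / len(ranks)


def average_rank_order(ranks, database_size):
    """Mean of 1 - (rank - 1) / database_size (assumed scoring rule)."""
    scores = [1 - (r - 1) / database_size for r in ranks]
    return sum(scores) / len(scores)


# Table 1: 18 first choices, one second choice, one third choice.
ranks = [1] * 18 + [2, 3]
```

Worked by hand: (18 x 1.00 + 0.95 + 0.90) / 20 = 19.85 / 20 = 0.9925.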
2) Performance of the Individual Windows. This performance is shown in
table 2:

Individual      Absolute     Average Rank
Window          Correct      Order

1               0.50         0.915
2               0.75         0.983
3               0.60         0.870
4               0.35         0.933
5               0.30         0.823
6               0.55         0.958
All Combined    0.90         0.993

Table 2. Test Results for Individual Windows


The data indicate that the performance when the windows are combined is
much greater than when they are taken individually. They also show that
although the individual windows had relatively low performance as far
as absolute correctness, the correct answer was usually close to the
top, as indicated by the Average Rank Order.

3) Performance of Multiple Windows. In order to find the effect of
incrementally adding windows to the system, the recognition data was
recalculated as the number of windows was increased from 1 to 6.


Windows         Absolutely   Average Rank
Used            Correct      Order

1               0.50         0.910
1,6             0.65         0.930
1,6,2           0.70         0.980
1,6,2,3         0.35         0.988
1,6,2,3,4       0.90         0.993
1,6,2,3,4,5     0.90         0.993

Table 3. Recognition Results from Combining Multiple Windows


B. Other Results Noted during Testing.

1) The system will identify a face with only partially-recognized
facial images. In many cases, an individual did not even appear as a
candidate in one or two windows, but was still identified as a result
of strong performance in the other windows. The system was determined
to provide a reasonable engineering approximation to the Goldschlager
set completion process.
2) Gestalt calculations of negative images gave little separation, as
the system paid more attention to the skin than the hair or features.
This is because the system only works where black-colored pixels have
the high-energy content. Humans also seem to have problems recognizing
negative images. If humans were edging or cartooning the image, as some
researchers suggest, then a negative image would give the same result
as a positive one. Since a human is indeed sensitive to negative
images, the CTT model may provide a possible explanation of why this is
so.

3) The performance of the individual windows in the face recognition
system provided a reasonably accurate model of human recognition
performance for the same sub-parts of the face (8).

VII.  SUMMARY.

The CTT Face Recognition System was performance tested with a database
of 20 people. The following are some of the significant results:

1) It identified the correct person as 1st choice 90.0% of the time,
and the Average Rank Order was 99.25%.

2) It provides an explanation of why the primate visual system splits
images vertically before displaying them on separate right and left
primary visual cortexes.

3) It strongly suggests that the gestalt operation, as proposed by CTT,
can indeed provide high-performance form recognition when it is coupled
with the use of multiple windows on an image.

This is a result predicted by CTT and borne out in this research.

The performance of the face recognition system strongly suggests CTT's
general applicability to vision, and increases its credibility as a
general model of human sensory information processing. The conclusion
of this research is that Cortical Thought Theory is a promising new
architecture with demonstrated effectiveness, worth increased research
and development by those interested in developing computing systems
with human-like sensory information processing capabilities.
Owing to the autonomous nature of the machine, use of the Autonomous
Face Recognition Machine (AFRM) is a relatively simple task. Once a
digital image has been acquired, the operator initiates analysis of the
image at the press of a button. After initiating the analysis, the
operator's role is limited to deciding what to do with the various
results of the analysis. The discussion in this guide follows the
normal flow of processing. The special function of generating images
that are used specifically for training the AFRM for recognition is
discussed at the end of this user's guide.

A complete listing of all necessary directories, programs, links and
data files is given at the beginning of appendix F of this thesis.

Image  Acquisition 

To  begin  the  analysis  of  any  digital  image,  the  desired 
image  must  be  loaded  in  the  Octek  Image  Analyzer  Card  (IAC). 
There  are  basically  two  ways  to  do  this.  One  way  is  to  load 
a  previously  saved  image  from  memory.  The  other  way  is  to 
load  the  image  currently  being  input  to  the  video  camera. 

ACQUIRING  AN  IMAGE  PRIOR  TO  AFRM  STARTUP 

If the user desires to load an image which has previously been saved in
memory and does not recall the name of the image file, or is not sure
if the image file is present in the current directory (NSMITH), then
the following procedure should be used. This procedure should be used
before activating the AFRM.

To examine which image files are in the current directory, type the
following at the Nova terminal (while in directory "NSMITH"):

LIST/A/S  -.VD 

The result of this command is a listing (in alphanumeric order) of all
digital image files (files with names ending in ".VD") which exist in
the NSMITH directory. If the desired file is not present in the NSMITH
directory, it may be moved from directory "FACEPICS" by typing the
following command (while in the directory FACEPICS) at one of the
Eclipse terminals:

MOVE/V NSMITH filename.VD

As before, while in directory FACEPICS, the user may examine the files
available in this directory by use of the "LIST/A/S -.VD" command.

Once the user knows the filename of the digital image he wishes to
display, and has confirmed its presence in directory NSMITH, he may
display that file by using the program Octek. This program is activated
by typing the following at the Nova terminal:

OCTEK 

The user should select option "2" when the Octek menu appears. This
option allows the user to enter the filename (which includes the ".VD"
suffix) of the digital image to be loaded. Once the user enters the
full filename and presses return, the image is simultaneously loaded
into the IAC and displayed on the monitor. If the user accidentally
selected the wrong image, he may select option "2" again and repeat the
process. The user then exits the Octek program by entering "-1" and
return at the keyboard.

If the user knows the name of the previously saved image file and is
certain that it exists in the current directory (NSMITH), or the user
wishes to acquire the desired image via the video camera, he may
accomplish this as part of the initialization routine of the AFRM.

INITIALIZING  THE  AFRM 

The AFRM is initialized by typing the following command at the Eclipse
terminal (background or foreground terminal) while in directory
"ESMITH":

FINDFACE 

The  user  should  verify  the  presence  of  the  following 
successful  initialization  message  (at  the  Eclipse  terminal): 

*  *  *  READY  TO  PROCESS  PICTURE  DATA  *  *  * 

The  above  is  all  that  need  be  done  at  the  Eclipse 
terminal.  The  rest  of  the  operator  steps  are  performed  at 
the  Nova  terminal. 

At  the  Nova  terminal  the  AFRM  is  initialized  by  typing 
the  following  command: 

AUTOFACE 

If the Nova detects that the Eclipse has not been initialized properly,
it displays instructions to the operator about how to initialize the
Eclipse terminal. The Nova then waits until the Eclipse has been
initialized.

When the Nova detects that the Eclipse has been properly initialized,
it outputs the following to the operator:

ALL SYSTEMS READY

WOULD YOU LIKE TO SET UP AN IMAGE USING THE OCTEK
PROGRAM ? (1-YES, 0-NO) :

ACQUIRING  AN  IMAGE  AFTER  AFRM  STARTUP 

If the user wishes to display a previously saved image file, or wishes
to acquire an image through the video camera and save that image in
memory before proceeding, then he should select "1". Retrieval of a
previously saved image is performed as discussed earlier, except that
Octek does not give the user an opportunity to see what image files are
currently available. The user must know beforehand the exact name of
the image file he wants to display. If the user wishes to acquire an
image via the video camera and save it to memory, he must perform the
save at this point in the process. The user will be given another
opportunity to acquire an image via the video camera at a later point
in the algorithm. Thus, if the user simply wishes to acquire an image
via the video camera, and is not concerned about saving the image, then
he may skip this option by selecting


ACQUIRING AND SAVING AN IMAGE FILE

If the user desires to acquire an image and save it to memory, then he
must use the Octek program. The user should ensure that the video
camera has been turned on and warmed up a few minutes prior to
attempting to acquire the picture. Once the Octek menu is displayed,
the user may acquire the image by selecting option "1" from the menu.
This action causes the monitor to display (in real time) the image
being generated in the video camera. At this point, the user should
make the following adjustments at the camera:

1.  F-Stop  setting  of  F8.0 

2.  Focus  at  30  ft. 

3.  Zoom  -  as  appropriate  (see  below). 

If faces are present in the desired image, the operator should ensure
two things for successful program performance. First, the eyes of any
subjects should be located in the central part of the screen. Figure
B-1 illustrates the area of the image in which the eyes (both of them)
must appear. Second, since the program only finds faces within a
certain range of facial sizes, the zoom of the camera must be adjusted
so that the subject faces fall within this range.

As a rule, the facial image should not be larger than 3.5 inches (when
measured from bottom of chin to top of head, on the TV screen) and
should not be smaller than 2.0 inches.

When the previous adjustments have been accomplished, the user then
freezes the image by hitting the "return" key at the Nova terminal. He
may then save the image displayed by selecting option "3" on the Octek
menu and entering an appropriate filename for the image. The user then
exits the Octek program by entering a "-1" at the terminal. This action
returns the user to the main AFRM algorithm.

At  this  point,  initialization  of  the  AFRM  is  complete 
and  the  following  AFRM  menu  is  displayed: 


Autonomous Face Recognition Machine

Keypad Menu

1--camera #1 on
4--camera #1 off
6--process picture
8--terminate and exit to system

Here the user is given an additional opportunity to acquire an image
via the video camera. If the user desires to do so, he must ensure that
the camera is on and warmed up (about 10 minutes) before selecting
keypad button #1 on the Octek keypad. Note that this instruction refers
to button #1 on the Octek keypad, and not to the "1" key on the
terminal keyboard. The user must still ensure that the proper camera
adjustments are performed as described earlier. Once the user is
satisfied that he is ready to take the picture, he need only press the
Octek keypad button #4. Again, the user is cautioned that an image
acquired in this fashion will not be saved. The image will be altered,
and eventually erased, once the picture is processed.


Image  Processing 

Assuming  that  Che  Image  that  Che  user  desires  Co  be 
analyzed  is  now  loaded  In  Che  IAG ,  he  is  now  ready  Co  begin 
Che  process  of  analyzing  Che  Image  for  Che  presence  of 
faces.  To  begin  Chls  process  Che  user  muse  press  Che  OcCek 
keypad  buCCon  #6. 

ANALYSIS  OF  THE  SEARCH  AREA 

Analysis of the search area (illustrated in figure B-1)
proceeds from the bottom to the top of the search area. This
process can take anywhere from 5 to 30 minutes, depending
upon the nature of the background in the image search area.
During this time, the Eclipse terminal continuously reports
which row of the search area is currently being analyzed.
The analysis proceeds from the bottom row (row 176) to the
top row (row 56) in increments of 2 rows.
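
The row schedule described above can be sketched as follows. This is only an illustration in Python (not the AFRM's Fortran); the constant and function names are hypothetical, and only the three numbers come from the text.

```python
# Rows visited by the search, using the bounds quoted in the text.
BOTTOM_ROW = 176   # first row analyzed
TOP_ROW = 56       # last row analyzed
ROW_STEP = 2       # spacing between analyzed rows


def search_rows(bottom=BOTTOM_ROW, top=TOP_ROW, step=ROW_STEP):
    """Return the row numbers in the order the search visits them."""
    return list(range(bottom, top - 1, -step))


rows = search_rows()
print(rows[0], rows[-1], len(rows))
```

Under these bounds the search visits 61 rows, which helps explain why a full pass over a cluttered background can be slow.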

As possible eye signatures are located, messages are
displayed at the Nova terminal indicating the event. As the
Nova extracts 64 by 64 image sub-sections (for use in
further analysis by the Eclipse), it displays a box cursor
on the TV monitor which outlines the area of interest. Upon
completion of analysis of the image sub-section by the
Eclipse, the Nova displays messages indicating the results
of the analysis. If a face was successfully found in the
image sub-section, then a message to that effect is
displayed. In addition, the Nova places a set of grid marks
on the displayed image, at the location where the face was
found. These grid marks indicate the internal region of the
face and the locations of certain facial features. The
facial feature locations indicated by the grid marks are:

1.  Left  and  right  sides  of  the  eyes 

2. Center of face

3. Top of nose

4.  Center  of  mouth 

The two horizontal grid marks located above the nose
mark do not correspond directly to any facial feature. Their
positions are determined by the distance between the two
maxima in the eye signature and the top of nose location.
Adding this distance once to the top of nose location gives
the grid mark centered about the level of the eyes. Adding
it twice gives the top horizontal grid mark. Figure B-2
illustrates an example of the grid marks displayed when a
face is found.
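
The arithmetic for the two upper grid marks can be sketched as below. This is a minimal Python illustration, not the AFRM code; the function name and the `toward_top` parameter are hypothetical. The text only says the distance is "added", so the sign of a step toward the top of the face is left as an explicit assumption.

```python
def horizontal_grid_marks(nose_top, maxima_distance, toward_top=1):
    """Return (eye_level_mark, top_mark) positions.

    nose_top        -- position of the top-of-nose mark
    maxima_distance -- distance between the two eye-signature maxima
    toward_top      -- +1 or -1: sign of a step toward the top of the
                       face in the coordinate system in use (assumed)
    """
    eye_level = nose_top + toward_top * maxima_distance
    top_mark = nose_top + toward_top * 2 * maxima_distance
    return eye_level, top_mark


print(horizontal_grid_marks(nose_top=20, maxima_distance=12))
```

The input values are made up for illustration; the three marks end up equally spaced, one maxima-distance apart.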

The AFRM will process up to four facial images for any
single test image. As each image is found, in addition to
the actions indicated above, the Nova will save (to disk)
the 64 by 64 image sub-section containing the facial image.
The first such image is saved to disk as the file
"TEST1.PI", the second as "TEST2.PI", and so on. Appended
to these image files are the data indicating the locations
of the various features (grid marks) that were found, and
the enhancement multiplier used.

Figure B-2. Example of Grid Marks Indicating a Face was Found.

If a face was not found, or one was found but was too
large to fit in the 64 by 64 image, the operator is informed
of these events by messages to the Nova console. The AFRM
then continues with the analysis of the search area until
four facial images are located or analysis of the search
area has been completed.

If, at any time after the first facial image has been
found, the user chooses to abort the rest of the search, he
may do so by pressing Octek keypad button #7. This action
will immediately terminate the process of searching for
faces in the test image. The Nova console then displays the
following message and prompt:

FACES FOUND WITHIN THE BOX CURSOR AREA = "X"

WOULD YOU LIKE TO PRINT THIS IMAGE ? (1=YES, 0=NO) :

The value of "X" is determined by the number of facial
images found. The operator is given the option to print the
image containing the results of the analysis. If he chooses
to print the image, he must wait while the image is being
printed (1 or 2 minutes) before proceeding with any other
operations. The operator is forced to wait because the
Eclipse is incapable of performing other operations while
printing. If the Nova were allowed to proceed, it might very
well get out of (task) synchronization with the Eclipse. If
no faces were found, the program on the Nova then returns to
the initial AFRM menu, and is ready to start again, or quit,
as the operator decides.

If  faces  were  found  in  the  image  search  area,  then 
after  responding  to  the  user's  selection  concerning  printing 
the  image,  the  following  prompt  is  displayed  at  the  Nova 
console: 

Would  you  like  to  : 

1  -  TRAIN  WITH  THESE  FACES  ? 

2  -  IDENTIFY  THESE  FACES  ? 

3  -  QUIT  ? 

The operator should then enter a 1, 2 or 3 (on the
Nova keyboard) depending on which option he chooses. If the
operator chooses option 3, then the AFRM is re-initialized
and the initial menu is displayed again. The operator may
then acquire a new image and start the process over again,
or terminate and exit to the system.

GENERATING  FACIAL  GESTALTS 

If the operator chooses option 1 or 2, the original
test image is removed from the screen, and a new display is
generated which illustrates the enhancement and facial
windows used in calculating the facial gestalt values. The
order of processing follows the order in which the faces
were found in the original image. Thus, the image stored in
image file "TEST1.PI" contains the first image processed.
Figure B-3 illustrates the results of this process.
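
The abstract describes the gestalt as a location that corresponds closely to the center of mass of the pixel intensity distribution of a facial window. The sketch below computes a plain intensity-weighted center of mass in Python. It is not the AFRM's gestalt routine, which is not reproduced in this appendix, and the function name is hypothetical.

```python
def gestalt_point(window):
    """Intensity-weighted center of mass (row, column) of a 2-D window.

    window is a list of rows of non-negative pixel intensities.
    """
    total = row_moment = col_moment = 0.0
    for r, row in enumerate(window):
        for c, value in enumerate(row):
            total += value
            row_moment += r * value
            col_moment += c * value
    if total == 0:
        raise ValueError("window has no intensity mass")
    return row_moment / total, col_moment / total


# Made-up 3 x 3 window with two equally bright pixels.
window = [
    [0, 0, 0],
    [0, 4, 0],
    [0, 0, 4],
]
print(gestalt_point(window))
```

With two equally bright pixels, the result lies midway between them, which is the behavior one would expect of a center of mass.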


After determining the facial window gestalts for each
facial image, the AFRM proceeds in a different manner
depending upon whether the operator chose to train or
identify the facial images.

STORING  THE  GESTALT  RESULTS 

If the operator chose to train with the facial images,
then he must supply additional information so that the data
may be properly stored in the facial database. The first
thing the operator must do is respond to the following
prompt:

PLEASE  ENTER  A  NAME  FOR  THIS  PICTURE  FILE: 

The operator responds by entering a descriptive name
for the file which contains no more than 8 characters, plus
the ".PI" extension. For example, the last name of the
individual illustrated in figure B-3 is "King". Thus, an
appropriate filename might be "KING.PI". If a file with
this filename already exists in the directory, then the
following prompt is displayed:

*  *  *  FILE  ALREADY  EXISTS  *  *  * 

WHAT  WOULD  YOU  LIKE  TO  DO  ? 

1  -  TRY  ANOTHER  FILENAME 

2  -  WRITE  OVER  EXISTING  FILE 

If the operator chooses "1", then he is given another
opportunity to enter a filename. If option "2" is selected,
the program deletes the existing file, and creates a new
file (under the chosen filename) containing the data from
TEST1.PI. The operator is then required to respond to the
following prompt:

Do you want to store this record ? (1=Yes, 2=no) :

If the operator wishes to store the window gestalts for
the facial image which has just been processed, then he
responds with a "1". Otherwise, he enters a "2", and the
AFRM proceeds to the next facial image.
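
The picture-file naming rule and the overwrite choice described earlier in this section can be sketched as follows. This is a hypothetical Python illustration: the function names are invented, and whether the AFRM itself rejects names that break the 8-character rule is not stated in the text.

```python
def is_valid_picture_filename(name):
    """True if name is a stem of 1 to 8 characters followed by '.PI'."""
    stem, dot, ext = name.rpartition(".")
    return dot == "." and ext.upper() == "PI" and 1 <= len(stem) <= 8


def resolve_filename(name, existing, overwrite):
    """Return the name to write, or None if another name must be tried.

    existing  -- set of file names already in the directory
    overwrite -- True for option 2 (write over existing file),
                 False for option 1 (try another filename)
    """
    if not is_valid_picture_filename(name):
        return None
    if name in existing and not overwrite:
        return None
    return name


print(resolve_filename("KING.PI", {"KING.PI"}, overwrite=True))
```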

If  the  operator  selects  to  store  the  window  gestalts, 
he  is  presented  with  the  following  menu  options: 

Please  enter  the  ID  number  of  this  person: 

1  -  Enter  ID  number 

2  -  Add  a  New  Name 

3  -  Use  last  ID  number  in  the  system 

(LAST  ID  NUMBER  -  20  CAPT  ED  SMITH) 

9  -  Look  at  or  Edit  Previous  Records 

If the subject to whom the face belongs already has
an ID number, but the user does not recall what the ID
number is, he should select option "9". This option provides
a listing of all the individuals (currently in the system)
and their associated ID numbers. He may then return to the
above menu and select option "1" to enter the desired ID
number of the individual.

If the subject has not been previously entered into the
database, then option "2" should be selected. The operator
must then respond to various prompts for personal
information associated with the subject, such as the
subject's name, default picture filename, and speech
synthesizer string. The speech synthesizer string determines
what the computer will say aloud (via the DECTALK speech
synthesizer) when it has determined the identity of the
individual (in any future identification tests). Entries
for the speech synthesizer are written much as one would
write an ordinary sentence, such as "Hello, Captain
Smith. How are you today?". Sometimes the spelling must be
altered to obtain the proper enunciation.

Once the system has an ID number to associate with the
subject's facial window gestalt data, the data is stored in
the facial database under the given ID number. This data
will be used to train the system the next time the database
statistics are updated. It is important to note that, at
this point, the system has not actually been "trained" with
the image data. The data has merely been stored in the
database. To actually train the system, the user must update
the database statistics using the program "MAIN". This is
done after the AFRM has been terminated. Instructions for
the use of the program MAIN are contained in Russel's thesis
(22:B-39) and are not repeated here.

At this point the system retrieves the next facial
image that was found and proceeds, as before, to generate
the facial windows and associated gestalt values for the new
image. These values are then stored as described previously.
This process continues until all the facial images that were
found have been processed. The system is then
re-initialized and the initial AFRM menu is re-displayed to
the operator. The operator may then choose to acquire a new
test image for subsequent analysis or to exit to the
operating system.

IDENTIFYING  AN  UNKNOWN  SUBJECT 

If the operator chose to identify the facial images
that were originally found by the analysis of the search
area, the program proceeds differently than for training.
After determining the facial window gestalts (as
illustrated in figure B-3) for a particular TESTX.PI image,
the AFRM immediately proceeds to the "identification phase"
of the process for that facial image. If the operator
wishes to use the DECTALK speech synthesizer during this
phase, he must activate it beforehand. This is done by
turning on the DECTALK unit and ensuring that the
appropriate communication channels have been opened between
the Eclipse system and the DECTALK. The DECTALK does not
have to be activated in order for the identification process
to proceed.

The  identification  process  generates  a  new  display  on 
the  TV  monitor.  An  example  of  this  display  is  indicated  in 
figure  B-4.  The  test  image  (containing  the  unknown  subject) 
is  displayed  in  the  upper  left  corner  of  the  display.  The 
results  of  the  analysis  are  displayed  as  follows: 

1.  A  default  picture,  representing  the  computer's  first 
choice  for  who  the  subject  is,  is  displayed 
immediately  below  the  image  of  the  unknown  face. 

2. The name of the computer's first choice is displayed
to the right of this #1 candidate's default image.

3. Default images for the 1st, 2nd and 3rd runners-up
are displayed below the #1 candidate's image.

4. To the left of each candidate's default image, a
pseudo-probability figure is displayed. This figure
is a measure of how certain the computer is that it
has correctly identified the unknown individual. The
larger the percent difference between the #1
candidate and the runners-up, the more certain the
computer is of its selection.
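
The ranking shown on this display can be sketched as follows. The abstract states only that unknown gestalts are compared with those of known faces "using a distance metric"; the Euclidean distance used here is an assumption, as are all names and sample values. The formula behind the pseudo-probability figure is not given in this appendix, so `percent_margin` only measures the gap between figures that are already available.

```python
import math


def rank_candidates(unknown, known, keep=4):
    """Rank known subjects by Euclidean distance to the unknown face.

    unknown -- flat list of gestalt coordinates for the unknown face
    known   -- dict mapping subject name to a list of the same length
    keep    -- candidates returned (first choice plus three runners-up)
    """
    ranked = sorted(
        (math.dist(unknown, gestalts), name)
        for name, gestalts in known.items()
    )
    return [(name, distance) for distance, name in ranked[:keep]]


def percent_margin(figures):
    """Gap between the largest and second-largest percentage figure."""
    ordered = sorted(figures, reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two candidates")
    return ordered[0] - ordered[1]


# Made-up two-coordinate gestalt vectors, for illustration only.
known = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [6.0, 8.0]}
print(rank_candidates([0.0, 0.0], known))
print(percent_margin([62.0, 21.0, 10.0, 7.0]))
```

A large margin between the first and second figures corresponds to the situation the text describes as a confident selection.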

Along  with  the  display  of  the  results,  the  Eclipse 
prints  out  the  results  of  the  identification  analysis,  on 
the  printer.  At  the  completion  of  the  identification 
analysis,  if  the  DECTALK  has  been  activated,  the  computer 
will  verbally  announce  the  name  of  the  #1  candidate  along 
with  a  verbal  greeting. 

The  only  option  available  to  the  operator,  during  the 
identification  phase,  is  whether  or  not  to  print  the  image 
which  displays  the  identification  results. 

This entire process is repeated for each facial image
that was found in the analysis of the original test image.
Once all faces have been processed, the AFRM is
re-initialized and is ready to begin again.


Training  Image  Generation 

The following discussion concerns the generation of
images used specifically to train the AFRM recognition
database. Images such as the one illustrated in figure B-5
were used for this purpose. These images were constructed
using the program PICTURE2. This program (which is fully
described in appendix B of Russel's thesis) allows the user
to construct composite images like that illustrated in
figure B-5.

After activating the PICTURE2 program, the training
images are generated by first selecting the additional
options menu (Octek keypad #5) from the PICTURE2 main menu.
The TV monitor screen is then cleared (to a white
background) by selecting option #4 while in the additional
options menu. This action also returns the user to the
PICTURE2 main menu. The 64 by 64 box cursor is then
activated by pressing Octek keypad button #7. Once
activated, the box cursor, which is now visible on the
screen, may be positioned anywhere on the screen.

Positioning of the box cursor is accomplished using the
upper Octek keypad buttons. The box cursor should be
positioned, for each subject facial image, such that the
subject's eyes will fall within the search area indicated by
the large rectangular box in figure B-5.

After positioning the 64 by 64 box cursor at the
desired position, the user should again select the
additional options menu by pressing Octek keypad button
#5. He may then choose to retrieve and display a 64 by 64
facial image which has previously been acquired (by the
program PICTURE2) and saved as a file named "filename.PI".
The program then displays the chosen facial image on the
screen, at the chosen location of the 64 by 64 box cursor.
This process of retrieving and displaying a facial image may
be repeated until an image like figure B-5 is generated. The
resulting image may then be saved using the OCTEK program,
which saves the full screen image. Subsequently, the image
may be analyzed by the AFRM, and the resulting facial
images processed and stored in the AFRM recognition
database.
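
The compositing procedure above (clear the screen to white, then place 64 by 64 facial images at box cursor positions) can be sketched as follows. This is a Python illustration, not PICTURE2 itself. The screen size, the white level of 15, and all names are assumptions made for the example.

```python
TILE = 64  # side of each facial image, per the text


def blank_screen(width, height, white=15):
    """A screen cleared to a white background (white level assumed)."""
    return [[white] * width for _ in range(height)]


def paste_tile(screen, tile, left, top):
    """Copy a TILE x TILE image into screen with its corner at (left, top)."""
    fits_rows = 0 <= top and top + TILE <= len(screen)
    fits_cols = 0 <= left and left + TILE <= len(screen[0])
    if not (fits_rows and fits_cols):
        raise ValueError("tile does not fit on the screen at that position")
    for r in range(TILE):
        screen[top + r][left:left + TILE] = tile[r]
    return screen


# Hypothetical 256 x 256 screen and an all-black stand-in for a face image.
screen = blank_screen(256, 256)
face = [[0] * TILE for _ in range(TILE)]
paste_tile(screen, face, left=10, top=20)
print(screen[20][10], screen[19][10])
```

Repeating `paste_tile` at different cursor positions builds up a composite of several faces on one screen, in the spirit of figure B-5.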

All of the facial images in the face databank (in
excess of 300 images representing more than 50 subjects)
were initially acquired and saved by PICTURE2 as 64 by 64
images, so the AFRM can access all of that data.

It  is  recommended  that  future  potential  users  continue 
to  use  the  PICTURE2  program  to  acquire  facial  images  that 
will  be  used  for  training.  This  will  keep  the  facial  image 
databank  (found  in  directory  FACEPICS)  consistent. 
Facial Image Samples #2.

Figure E-19. Test Image #10 Results.

Figure E-23. Test Image #12 Results.

Figure E-27. Test Image #15 Results.

Figure E-28. Test Image #16 Results.


Software Listings


Listed below are all directories, programs, links and
data files which comprise the AFRM. Reference is made to
actual compiled code. Source code filenames are the same
except that the extensions (".ol", ".rb" and ".sv") all
change to ".fr". Two major listings are provided, one for
the Nova and one for the Eclipse.

Following the two listings, a similar listing is
provided which applies to the facial database management
software. This listing shows the programs, etc., necessary
to execute the program "MAIN".

At  the  end  of  this  appendix,  the  Fortran  source  code 
for  the  two  top  level  manager  programs  (Facefndr.fr  and 
Getsubj.fr)  is  provided. 

AFRM Program Listings

Note:  Indentation  indicates  programs  which  are 
called  as  part  of  a  higher  level  program. 

Since  some  low  level  programs  are  called 
by  more  than  one  higher  level  program,  the 
low  level  programs  may  appear  more  than  once 
in  the  following  list. 

Computer  system  :  Nova 

Directory:  NSMITH 

Programs: Autoface.mc

Efacestat.sv
Octek.sv
Getsubj.sv

Svpic.sv
Gtgest.sv

Newtext.rb
Process2.sv
0.ol
D.ol
R.ol

Showgest.sv

Gestalt.rb
Getname.sv

Newtext.rb
Urna2.sv

Rdna1.rb
Train.sv
Cleanup.sv
Iddisp.sv
Showdisp.sv

Nproc1.rb
Newtext.rb
Rdna1.rb
Rdna4.rb

Links: Link to Dir Nsave: Octek.sv

Data Files: Emainpix
Idfile
Idfile2
Idfile3
Idfile4
Idnum
Remem1.tk
Remem2.tk
Sigmas
Window1
Window2
Window3
Window4
Window5
Window6
Window1.lu
Window2.lu
Window3.lu
Window4.lu
Window5.lu
Window6.lu
Window1.sp
Window2.sp
Window3.sp
Window4.sp
Window5.sp
Window6.sp
Wfactor

Directory: OCTEK
Programs: None.
Other: IACF4.LB (Octek library of functions)

Directory: NSAVE
Programs: Octek.sv


End Nova System


Computer system: Eclipse

Directory: ESMITH

Programs: Findface.mc

Cleanup2.sv
Facefndr.sv

Cleanup3.sv

Rtransa.rb
Newpix.rb

ii ' * x < r h
>ut h 1 r (
Cortran16.sv

Rtran.rb
Rtransb.rb

Remid.sv

Talkfile.rb
Retrieve.rb
Add.rb

Sort.rb
Probab.rb
Rdname.rb
Rdname2.rb
Copyfile.rb

Appendit.rb
Rdname3.rb
Countit.rb
Talker.rb
Rtran.rb
Eyef.ol
Rtransb.rb
Fthor.ol

Data Files: None.


Directory: NSMITH
Programs: None.

Links: Link to Esmith:Cleanup2.sv
Link to Esmith:Facefndr.sv
Link to Esmith:Facefndr.ol
Link to Esmith:Cleanup3.sv
Link to Esmith:Cortran16.sv
Link to Esmith:Remid.sv

Data Files: Same as for Nova (see Nova System/NSMITH
Data Files)


Directory: Facepics
Programs: None.
Links: None.
Data Files: Emainpix
Emainpix.bu
-.pi (all 64 by 64 face image files)


End Eclipse System.
Facial Database Management Program Listing


The  following  is  a  program  listing  for  the  software 
used  to  maintain  the  face  recognition  database.  Except  for 
changes  to  incorporate  speech  synthesis  and  a  display  mode 
(when  performing  recognition),  the  software  is  the  same  as 
that  documented  in  Russel's  thesis.  Reference  Appendix  B  in 
Russel's  thesis  (page  B-39). 

Computer  System:  Eclipse 

Directory:  Esmith 

Programs: Main.mc

Main.ol
Main.sv
S.ol

Createit.rb
Rdname.rb
Addit.rb
Rem.ol

Talkfile.rb
Retrieve.rb
Probab.rb
Rdname.rb
Add.rb
Sort.rb        Same files
Rdname2.rb     as for
Copyfile.rb    Remid.sv
Appendit.rb    (see p.F-4)
Rdname3.rb
Countit.rb
Talker.rb
Mark.ol
Pac.ol
Loa.ol

Copyfile.rb
Listr.ol
Selec.ol
Changev.ol
Testa.ol

Addit.rb
Exa.ol

Dataret.rb
Newmemory.rb

Createit.rb

Links: None.

Data Files: None.


Directory: Nsmith
Programs: None.

Links: Link to Esmith: Main.sv
Link to Esmith: Main.ol

Data Files: Emainpix
Idfile
Idfile2
Idfile3
Idfile4
Idnum
Remem1.tk
Remem2.tk
Sigmas
Window1
Window2
Window3
Window4
Window5
Window6
Window1.lu
Window2.lu
Window3.lu
Window4.lu
Window5.lu
Window6.lu
Window1.sp
Window2.sp
Window3.sp
Window4.sp
Window5.sp
Window6.sp
Wfactor

End  Eclipse  System. 


Computer  System:  Nova 

Note: The following programs are used if the user
wishes to display (on the TV monitor) the
results of a recognition test which has been
performed using "MAIN" on the Eclipse. Main
will work just fine without them, if the user
chooses not to use the display mode.

Directory: Nsmith

Programs: Display.mc

Cleanup.sv
Clearit.sv

Demo1.mc

Showdisp.sv
Dtitle.sv
DProc1.sv
DFeat.sv
Proc1b.sv
DProc2.sv
Showgest.sv

Note: Data files are the same files listed for the AFRM
under Nova/Nsmith.

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C                                                               C
C  NAME:    FACEFNDR.FR   10/12/86                              C
C  AUTHOR:  E. SMITH                                            C
C  PURPOSE: GIVEN A 192X128 PICTURE OF A HUMAN FACE AS          C
C           ACQUIRED BY USING THE "GETSUBJ" PROGRAM ON THE      C
C           NOVA, THIS PROGRAM WILL DETERMINE THE LOCATION OF   C
C           ALL THE FACIAL FEATURES REQUIRED FOR FURTHER PROC-  C
C           ESSING OF THE FACE RECOGNITION MACHINE.             C
C                                                               C
C  TO COMPILE AND LINK, USE MACRO: FACEFNDM.MC                  C
C  HISTORY: COMES FROM CORTRAN16.FR                             C
C                                                               C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

      EXTERNAL EYEF
      EXTERNAL FTHOR
      INTEGER SUBFILE(27),IPIX(6144)
      REAL RINP(64),ARRAY(130),PARAM(16),CARRAY(64)
      INTEGER ISTAT(20),DUMMY(64),GESTDATA(300),PRNTARRAY(6,10),IFOUND(15)
      INTEGER STRIPDATA(400)
      INTEGER WHATFILE
      REAL MINMEAN,MAXMEAN
      IPROCPIC=0

C  SET UP FOR OVERLAYS
      CALL OVOPN(14,"FACEFNDR.OL",IER)

C  NOTIFY NOVA THAT THIS PROGRAM IS UP AND RUNNING.
      CALL CFILW("GREENLIGHT",2,IER)

C  INIT FACE FOUND ARRAY COUNTER TO "1"
5     IFOCT=1

C  ZERO OUT "PARAM" ARRAY
      DO 10 I=1,16
      PARAM(I)=0
10    CONTINUE

C  ZERO OUT "GESTDATA" ARRAY
      DO 20 I=1,300
      GESTDATA(I)=0
20    CONTINUE

C  ZERO OUT "STRIPDATA" ARRAY
      DO 30 I=1,400
      STRIPDATA(I)=0
30    CONTINUE

C  ZERO OUT "IFOUND" ARRAY
      DO 32 I=1,15
      IFOUND(I)=0
32    CONTINUE

      CALL SWAP("CLEANUP3.SV",IER)

C  Perform reinitialization, and bypass prompts.
      IF(IPROCPIC.EQ.0)GO TO 33
      IPROCPIC=0
      GO TO 50

C  CALCULATE TRANSFORM VALUES
33    CALL RTRANSA(ARRAY)

C  Calculate constants for CARRAY
      DO 36 I=1,64
      IFACTOR=I-1
      CARRAY(I)=0
      DO 34 J=9,64
      CARRAY(I)=CARRAY(I)+3.0*ARRAY(J-IFACTOR+63)
34    CONTINUE
36    CONTINUE


      TYPE
      CONTINUE
      TYPE
      TYPE
      TYPE "<7> * * * FACEFNDR -- Gestalt Processor Program * * *"
      TYPE
      TYPE
      TYPE
      TYPE
      TYPE "<7>"
      IPRN=0
      TYPE
      IWHATPROP=2
50    CONTINUE
      IPIXCT=0
      TYPE
      TYPE "<07> * * * READY TO PROCESS PICTURE DATA * * *"
      TYPE
      TYPE
      TYPE
      TYPE
      TYPE
      TYPE
60    CONTINUE

C  Print the output file TEMP.VD if the Eclipse receives PRNTIMAGE
      CALL STAT("PRNTIMAGE",ISTAT,IER)
      IF(IER.EQ.1)TYPE "<7>SIGNAL RECIEVED TO START PRINTING FILE"
      IF(IER.EQ.1)CALL NEWPIX(DUMMY)

C  Now delete the file
      IF(IER.EQ.1)CALL DFILW("PRNTIMAGE",IER)
      IF(IER.EQ.1)TYPE "<7>PRINTING IS COMPLETED"
      IF(IER.EQ.1)GO TO 50

C  Reinitialize if Face Recognition Program on Nova is done
      CALL STAT("FACEDONE",ISTAT,IER)
      IF(IER.EQ.1)CALL DFILW("FACEDONE",IER)
      IF(IER.EQ.1)GO TO 5

C  Terminate the program
      CALL STAT("TERMFACE",ISTAT,IER)
      IF(IER.EQ.1)CALL DFILW("TERMFACE",IER)
      IF(IER.EQ.1)STOP "Face finder terminated"

C  Calculate Gestalt Values (activate CORTRAN16)
      CALL STAT("CALCGEST",ISTAT,IER)
      IF(IER.EQ.1)CALL DFILW("CALCGEST",IER)
      IF(IER.NE.1)GO TO 63
      CALL SWAP("CORTRAN16.SV",IER)
      GO TO 50

C  Recognize an individual
63    CALL STAT("IDCOM",ISTAT,IER)
      IF(IER.EQ.1)CALL DFILW("IDCOM",IER)
      IF(IER.NE.1)GO TO 65
      CALL SWAP("REMID.SV",IER)
      GO TO 50

C  Check for "subject.vd" picture ready
65    CALL STAT("NSMITH:SUBJECT1.A",ISTAT,IER)
      IF(IER.NE.1)GO TO 210
      TYPE "<15><7>ACCESSING IMAGE FILE"
      IPROCPIC=1
      CALL DFILW("NSMITH:SUBJECT1.A",IER)
70    CALL OPEN(1,"SUBJECT.VD",3,IER)
      IF(IER.EQ.1)GO TO 76
      TYPE
      TYPE "* * ERROR OPENING IMAGE FILE * *"
76    IF(STRIPDATA(386).EQ.1)GO TO 80
      ISBLK0=1
      GO TO 90
80    ISBLK0=(90-STRIPDATA(395))/18
90    DO 180 K=ISBLK0,5
      ISBLK=72-(K-1)*18
      STRIPDATA(395)=ISBLK
      CALL RDBLK(1,ISBLK,IPIX,24,IER)
      IF(STRIPDATA(386).EQ.1)GO TO 100
      ISHFT20=0
      GO TO 110
100   ISHFT20=24-STRIPDATA(394)
110   DO 178 ISHFT2=ISHFT20,24,2
      ISHFTINV=24-ISHFT2
      STRIPDATA(394)=ISHFTINV
      M=1
      DO 130 J=1,192
      IFACTOR=J+ISHFTINV*192
      DO 120 I=1,8
      IVAL=(I-1)*192+IFACTOR
      RINP(I)=15-IPIX(IVAL)
120   CONTINUE
      CALL RTRAN(ARRAY,CARRAY,RINP)
      BMAX=0
      IR3D=0
      JR3D=0
      DO 140 I=1,37
      IF(RINP(I).LE.BMAX)GO TO 140
      BMAX=RINP(I)
      IR3D=I
      JR3D=J
140   CONTINUE
      STRIPDATA(M)=IR3D
      M=M+1
      STRIPDATA(M)=JR3D
      M=M+1
130   CONTINUE
      ISTOPEYE=176-(24-ISHFTINV)-(K-1)*24
      CALL OVLOD(4,EYEF,1,IER)
      CALL EYEFIND(STRIPDATA)

C  Reset false eyes indicator
      STRIPDATA(386)=0
      IF(STRIPDATA(385).EQ.0)GO TO 176

C  VERIFY THIS ISN'T A PREVIOUSLY FOUND FACE
      IF(IFOUND(1).EQ.1)GO TO 152
      GO TO 190
152   IF(ISTOPEYE.GE.IFOUND(2).AND.ISTOPEYE.LE.IFOUND(3))GO TO 154
      GO TO 158
154   IF(STRIPDATA(387).GE.IFOUND(4).AND.STRIPDATA(387).LE.IFOUND(5))
     1 GO TO 156
      GO TO 158
156   STRIPDATA(386)=1
      GO TO 80
158   IF(IFOUND(6).EQ.1)GO TO 160
      GO TO 190
160   IF(ISTOPEYE.GE.IFOUND(7).AND.ISTOPEYE.LE.IFOUND(8))GO TO 162
      GO TO 166
162   IF(STRIPDATA(387).GE.IFOUND(9).AND.STRIPDATA(387).LE.IFOUND(10))
     1 GO TO 164
      GO TO 166
164   STRIPDATA(386)=1
      GO TO 80
166   IF(IFOUND(11).EQ.1)GO TO 168
      GO TO 190
168   IF(ISTOPEYE.GE.IFOUND(12).AND.ISTOPEYE.LE.IFOUND(13))GO TO 170
      GO TO 190
170   IF(STRIPDATA(387).GE.IFOUND(14).AND.STRIPDATA(387).LE.IFOUND(15))
     1 GO TO 172
      GO TO 190
172   STRIPDATA(386)=1
      GO TO 80

176   TYPE
      TYPE "<7>EYES NOT FOUND WITH TOP EYE WINDOW AT ROW ",ISTOPEYE

C  Check if operator wishes to end search
      CALL STAT("FACEDONE",ISTAT,IER)
      IF(IER.NE.1)GO TO 178
      CALL RESET
      CALL DFILW("FACEDONE",IER)
      GO TO 5
178   CONTINUE
180   CONTINUE
      TYPE "<15><15><15><7>ECLIPSE SEARCH ROUTINE COMPLETED"
      IF(STRIPDATA(385).EQ.0.AND.IFOUND(1).EQ.1)
     1 TYPE "<15>NO OTHER FACES WERE FOUND"

C  NOTIFY THE NOVA COMPUTER TO EITHER SEND A 64X64 SUBSECTION OR
C  INFORM THE OPERATOR THAT NO FACE WAS FOUND.
190   IF(STRIPDATA(385).EQ.0)GO TO 195
      TYPE
      TYPE "POSSIBLE EYES FOUND AT ROW ",ISTOPEYE
      TYPE
      TYPE "REQUESTING 64X64 IMAGE SUBSECTION FROM NOVA"
195   STRIPDATA(393)=ISTOPEYE
      CALL CLOSE(1,IER)
      CALL CFILW("ETEMP3",2,IER)
      CALL OPEN(3,"ETEMP3",2,IER)
      IF(IER.EQ.1)GO TO 200
      TYPE "* * ERROR OPENING STRIPDATA FILE * *"
200   CONTINUE
      WRITE BINARY(3) (STRIPDATA(I),I=385,400)
      CALL CLOSE(3,IER)
      IF(IER.NE.1)TYPE "CLOSE ERROR ON STRIPDATA FILE"
      CALL DFILW("CORDPTS64.B",IER)
      RENAME "NSMITH:ETEMP3","NSMITH:CORDPTS64.B"
      IF(STRIPDATA(385).NE.0)GO TO 210
      IPROCPIC=0
      IF(ISTOPEYE.LE.55)GO TO 685
      GO TO 50


210   CONTINUE

C  Check for IMAGE SUBSECTION ready
      IF(IPROCPIC.NE.1)GO TO 60
      CALL STAT("NSMITH:NOVASIG1.A",ISTAT,IER)
      IF(IER.NE.1)GO TO 60
      CALL OPEN(1,"WIND1.PI",3,IER)
      WHATFILE=1
      CONTINUE
      TYPE "<07> * * * FILENAME RECEIVED FROM NOVA * * *"
      TYPE
      CONTINUE
      CALL RDBLK(1,0,IPIX,16,IER)
      IFRSTIME=0
      IENHAN2=0
      CONTINUE
      TYPE "Input File Read"
      GESTDATA(257)=0

C  PERFORM GESTALT CALCULATIONS ON SUCCESSIVE, 8 PIXEL WIDE,
C  HORIZONTAL WINDOWS.
      IBKUP=0
      ISHFLO=24-IFRSTIME*(31-GESTDATA(272))
      ISHFHI=ISHFLO+6-IFRSTIME*(31-GESTDATA(272))
      DO 310 ISHFDWN=ISHFLO,ISHFHI,2
      M=1
      ISHIFT=ISHFDWN+6-IBKUP
      DO 290 J=1,64
      IFACTOR=J+ISHIFT*64
      DO 260 I=1,8
      IVAL=(I-1)*64+IFACTOR
      RINP(I)=15-IPIX(IVAL)


260   CONTINUE

C  Do Gestalt on column
      CALL RTRAN(ARRAY,CARRAY,RINP)
      BMAX=0
      IR3D=0
      JR3D=0
      DO 280 I=1,64
      IF(RINP(I).LE.BMAX)GO TO 280
      BMAX=RINP(I)
      IR3D=I
      JR3D=J
280   CONTINUE
      GESTDATA(M)=IR3D
      M=M+1
      GESTDATA(M)=JR3D
      M=M+1
290   CONTINUE
      ITOPEYE=ISHFDWN+7-IBKUP
      CALL OVLOD(4,FTHOR,1,IER)
      CALL FEATHOR(GESTDATA)
      IF(GESTDATA(257).EQ.0)GO TO 300
      GO TO 330

C  Since previous attempt failed, move up two rows and repeat process
300   CONTINUE
      IBKUP=IBKUP+4
      TYPE
      TYPE "<7>EYES NOT FOUND WITH TOP OF EYE WINDOW AT ROW ",ITOPEYE
310   CONTINUE
320   IF(IFRSTIME.EQ.0)TYPE "EYES NOT FOUND"
      IF(IFRSTIME.EQ.1)TYPE "EYES NOT FOUND AFTER ENHANCEMENT"
      ITOPEYE=0

C  GESTDATA FIELDS DEFINITION:
C
C  GESTDATA (1) - (128)   = GESTALT DATA FOR THE 64 COLUMNS
C  GESTDATA (129) - (256) = GESTALT DATA FOR THE 64 ROWS
C  GESTDATA (257) = 0 IF EYES NOT FOUND
C  GESTDATA (258) = LEFT SIDE OF HEAD (not used)
C  GESTDATA (259) = RIGHT SIDE OF HEAD (not used)
C  GESTDATA (260) = CENTER OF EYES
C  GESTDATA (261) = LEFT SIDE OF EYES
C  GESTDATA (262) = RIGHT SIDE OF EYES
C  GESTDATA (263) = IDELMX1
C  GESTDATA (264) = IDELMNN
C  GESTDATA (265) = IDELMNP
C  GESTDATA (266) = IDELMX2
C  GESTDATA (267) = IDELPK
C  GESTDATA (268) = LOSLOPE
C  GESTDATA (269) = LISLOPE
C  GESTDATA (270) = RISLOPE
C  GESTDATA (271) = ROSLOPE
C  GESTDATA (272) = ITOPEYE
C  GESTDATA (273) = MAXLL
C  GESTDATA (274) = MAXRL
C  GESTDATA (275) = TOP OF NOSE
C  GESTDATA (276) = CTR OF MOUTH
C  GESTDATA (277) = FALSE EYE OCCURRENCE INDICATOR
C  GESTDATA (278) = MINCV
C
C  PARAM field definition: (real numbers)
C
C  PARAM (1) = CONTRAST MULTIPLIER

330  CONTINUE

GESTDATA(272)=ITOPEYE
TYPE
TYPE "* * GESTALT CALCULATIONS ON EYES COMPLETED * *"
TYPE

C  IF EYES NOT FOUND, NOTIFY NOVA AND TERMINATE.
IF(ITOPEYE.EQ.0)GO TO 650

C  Determine the "top of nose" location

C  SET-UP "CENTER OF FACE" SLICE FOR PROCESSING

ILLIM=GESTDATA(273)
IRLIM=GESTDATA(274)
IENDW=IRLIM-ILLIM+1
ISENDW=IENDW+1
M=129
DO 370 K=1,64
IFACTOR=(K-1)*64+ILLIM-1
DO 340 I=1,IENDW
IVAL=IFACTOR+I
RINP(I)=15-IPIX(IVAL)
340  CONTINUE
DO 350 I=ISENDW,64
RINP(I)=3.0
350  CONTINUE
CALL RTRANSB(ARRAY,RINP)
BMAX=0
IR3D=0
JR3D=0
DO 360 I=1,64
IF(RINP(I).LE.BMAX)GO TO 360
BMAX=RINP(I)
IR3D=K
JR3D=I
360  CONTINUE
GESTDATA(M)=IR3D
M=M+1
GESTDATA(M)=JR3D
M=M+1
370  CONTINUE

TYPE
TYPE "* * GESTALT CALCULATIONS FOR VERT WINDOW COMPLETED * *"
TYPE

C  Find the lowest point (on the face) of the brightest area
C  immediately below the eyes.

IF(IFRSTIME.EQ.1)GO TO 410

C  RESET THE FALSE EYE INDICATOR

GESTDATA(277)=0
IGTOPEYE=130+2*ITOPEYE
TYPE "IGTOPEYE = ",IGTOPEYE
IHINOSE=IGTOPEYE+30
IHIVAL=GESTDATA(IGTOPEYE)
ILOVAL=IHIVAL
DO 380 I=IGTOPEYE,IHINOSE,2
IF(GESTDATA(I).LT.ILOVAL)GO TO 380
ILOVAL=GESTDATA(I)
IHILOC=I-1
380  CONTINUE

C  IF NO MIN, SET FALSE ALARM INDICATOR AND CONTINUE SEARCH

IHIVATST=ILOVAL-IHIVAL
IF(IHIVATST.GT.1)GO TO 390
TYPE
TYPE "NO INTENSITY MINIMUM DETECTED BELOW THE EYES"
GESTDATA(257)=0
GO TO 650

390  IOHILOC=IHILOC
TYPE "IHILOC = ",IHILOC

C  Now, locate the top of the nose which is usually located
C  just a few pixel rows below the point in "IHILOC"

DO 400 I=1,12
IVAL1=GESTDATA(IHILOC+1)
IVAL2=GESTDATA(IHILOC+3)
IVAL3=GESTDATA(IHILOC+5)
IDIFF1=IVAL1-IVAL2
IDIFF2=IVAL2-IVAL3
AVGOF2=(IDIFF1+IDIFF2)/2.0
IF(AVGOF2.LE.0)GO TO 470
IHILOC=IHILOC+2
400  CONTINUE
TYPE
TYPE "TOP OF NOSE NOT FOUND IN ORIGINAL PICTURE"

C  SINCE NOSE NOT FOUND, SET FALSE ALARM INDICATOR AND CONTINUE SEARCH

GESTDATA(257)=0
GO TO 650

C  Find lowest point of the bright area below the eyes (after enhance)

410  IHILOC=IOHILOC
TYPE "ITOPEYE = ",ITOPEYE
DO 420 I=1,10
IF(GESTDATA(IHILOC+3).LT.GESTDATA(IHILOC+1))GO TO 430
IHILOC=IHILOC+2
420  CONTINUE

C  Find the top of nose (after enhance)

430  IVAL=0
DO 460 I=1,5
IDIFF3=GESTDATA(IHILOC+1)-GESTDATA(IHILOC+3)
IF(IDIFF3.GT.0.AND.IDIFF3.GT.IVAL)GO TO 440
GO TO 450
440  IVAL=IDIFF3
INOSLOC=IHILOC
450  IHILOC=IHILOC+2
460  CONTINUE
IHILOC=INOSLOC+2

C  Check if top of nose found after enhancing the image

IF(IVAL.GT.1)GO TO 470
TYPE
TYPE "TOP OF NOSE NOT FOUND IN ENHANCEMENT"
GESTDATA(257)=0
GO TO 650

470  ITOPNOSE=GESTDATA(IHILOC-2)

C  Check if top of nose is in proper location wrt eyes

ICTRTOI=GESTDATA(260)-GESTDATA(273)
ITONOSE=ITOPNOSE-ITOPEYE
INOSLL=1.2*ICTRTOI
IF(ITONOSE.GE.INOSLL)GO TO 475
TYPE
TYPE "TOP OF NOSE TOO CLOSE TO EYES"
TYPE
TYPE "MIN VALUE MUST BE = OR GT ",INOSLL
TYPE "ACTUAL EYE TO NOSE DISTANCE = ",ITONOSE
GESTDATA(257)=0
GO TO 650

475  ICTRTOI=2.6*ICTRTOI
IF(ITONOSE.LE.ICTRTOI)GO TO 480
TYPE
TYPE "TOP OF NOSE TOO FAR FROM EYES"
TYPE
TYPE "DISTANCE FROM EYE TO NOSE = ",ITONOSE
TYPE "MAX ALLOWED = ",ICTRTOI
GESTDATA(257)=0
GO TO 650

480  IFRSTIME=IFRSTIME+1
TYPE "ITOPNOSE = ",ITOPNOSE
GESTDATA(275)=ITOPNOSE
INOSLOC=ITOPNOSE
IF(IENHAN2.EQ.2)GO TO 590
IF(IENHAN2.EQ.1)IFRSTIME=1
IENHAN2=IENHAN2+1
TYPE
TYPE "<07>* * * Now Creating Picture with Expanded Contrast * * *"
TYPE
CALL RDBLK(1,0,IPIX,16,IER)
MIN=15

C  FIND VALUE OF DARKEST SPOT IN PICTURE
DO 490 J=1,4096
IF(IPIX(J).LT.MIN)MIN=IPIX(J)
490  CONTINUE

C  EXPAND CONTRAST USING MODIFIED RUSSEL EXPANSION TECHNIQUE

AMEAN=14.5
SLOP=.05
MAXITERATE=12
ITERATE=0
PRESENT=1
DELTA=8
MINMEAN=AMEAN-SLOP
MAXMEAN=AMEAN+SLOP
ITOP=ITOPEYE-(GESTDATA(260)-GESTDATA(273))/2
IBOT=ITOPNOSE
ILEFT=GESTDATA(273)
IRIGHT=GESTDATA(274)
TYPE "ITOP IBOT ILEFT IRIGHT = ",ITOP,IBOT,ILEFT,IRIGHT
500  CONTINUE


SUM=0
SUMSQUARE=0
IF(ITERATE.GT.MAXITERATE)GOTO 550
ITERATE1=ITERATE+1
TYPE "Iteration ",ITERATE1
INUMPTS=0
DO 510 I=ITOP,IBOT
DO 510 J=ILEFT,IRIGHT
IVAL=(I-1)*64+J
VALUE=(IPIX(IVAL)-MIN+1)*PRESENT
IF(VALUE.GT.16)VALUE=16
IF(VALUE.LT.1)VALUE=1
SUM=SUM+VALUE
SUMSQUARE=SUMSQUARE+VALUE**2
INUMPTS=INUMPTS+1
510  CONTINUE

AVERAGE=SUM/INUMPTS
ITERATE=ITERATE+1
TYPE "Average pixel value (1 TO 16 SCALE)=",AVERAGE
TYPE
IF(AVERAGE.GT.MAXMEAN)GOTO 520
GOTO 530
520  CONTINUE
PRESENT=PRESENT-DELTA
DELTA=DELTA/2
GOTO 500

530  CONTINUE
IF(AVERAGE.LT.MINMEAN)GOTO 540
GOTO 550
540  CONTINUE
PRESENT=PRESENT+DELTA
DELTA=DELTA/2
GOTO 500
550  CONTINUE

TYPE "PRESENT = ",PRESENT
PARAM(1)=PRESENT
DO 560 I=1,4096
VALUE=(IPIX(I)-MIN+1)*PRESENT
IPIX(I)=VALUE
560  CONTINUE

C  FIND VALUE OF DARKEST PIXEL IN ENHANCED PICTURE

MIN=16
DO 570 I=1,4096
IF(IPIX(I).LT.MIN)MIN=IPIX(I)
570  CONTINUE

C  ADJUST PICTURE DOWN TO PUT DARKEST PIXELS AT ZERO INTENSITY

DO 580 I=1,4096
VALUE=IPIX(I)-MIN
IF(VALUE.GT.15)VALUE=15
IPIX(I)=VALUE
580  CONTINUE

GO TO 240

C  FIND CENTER OF MOUTH

590  IHILOC=IHILOC+10
IVAL=0
IMOUSWT=0
IMOULOC=0
DO 630 I=1,7
IDIFF4=GESTDATA(IHILOC+3)-GESTDATA(IHILOC+1)
IF(IDIFF4.LE.1)GO TO 610
IMOUSWT=1
IF(IDIFF4.GE.IVAL)GO TO 600
GO TO 620
600  IVAL=IDIFF4
IMOULOC=IHILOC
GO TO 620
610  IF(IMOUSWT.EQ.1)GO TO 640
620  IHILOC=IHILOC+2
630  CONTINUE

C  Check if mouth located

640  IF(IMOULOC.GT.0)GO TO 644
TYPE
TYPE "MOUTH NOT FOUND"
GESTDATA(257)=0
GO TO 650

C  Check if mouth located in proper location

644  IMOUTH=GESTDATA(IMOULOC)
ICTRTOI=GESTDATA(260)-GESTDATA(273)
IMOUTHUL=ITOPNOSE+ICTRTOI-ICTRTOI/3
IMOUTHLL=ITOPNOSE+ICTRTOI*2
IF(IMOUTH.GE.IMOUTHUL.AND.IMOUTH.LE.IMOUTHLL)GO TO 646
TYPE
TYPE "MOUTH LOCATED, BUT NOT IN PROPER POSITION."
TYPE
TYPE "MOUTH LOCATION LOWER LIM/UPPER LIM = ",IMOUTHLL,IMOUTHUL
TYPE "ACTUAL MOUTH LOCATION AT ",IMOUTH
GESTDATA(257)=0
GO TO 650

646  TYPE "CTR OF MOUTH AT ",IMOUTH
GESTDATA(276)=IMOUTH

C  ****************************************** 


650  CALL CFILW("ETEMP3",2,IER)
CALL OPEN(3,"ETEMP3",2,IER)
IF(IER.EQ.1)GOTO 660
TYPE "* * ERROR IN OPENING COORDINATE FILE ETEMP3 * *"

660  CONTINUE

C  Write coordinate points out to disk, so NOVA can use them.
WRITE BINARY(3)(GESTDATA(I),I=1,300),(PARAM(I),I=1,16)

C  Close coordinate file
CALL CLOSE(3,IER)
IF(IER.NE.1)TYPE "CLOSE ERROR ON COORDINATE FILE ETEMP3, IER=",IER

C  Delete any present coordinate files
IF(WHATFILE.EQ.1)CALL DFILW("COORDPTS1.B",IER)

C  Rename our temporary file to a coordinate file
C  (This is necessary because the NOVA is constantly scanning for
C  these filenames, and must not see them until they have been properly
C  filled with data. Otherwise, it will try to open them when they
C  are first created.)
IF(WHATFILE.EQ.1)
1  RENAME "NSMITH:ETEMP3","NSMITH:COORDPTS1.B"

670  CALL CLOSE(1,IER)
IF(IER.NE.1)TYPE "CLOSE ERROR ON DATA FILE, IER=",IER

C  Now that we're done processing the file, delete the flag file from the
C  Nova.
IF(WHATFILE.EQ.1)CALL DFILW("NSMITH:NOVASIG1.A",IER)
C  IF(IER.EQ.1)GOTO 680
C  TYPE "NOVASIG NOT DELETED, IER=",IER

680  CONTINUE

STRIPDATA(386)=1
IF(GESTDATA(257).EQ.0)GO TO 685
IPIXCT=IPIXCT+1
IF(IPIXCT.GE.4)ISTOPEYE=0
IF(IPIXCT.GE.4.AND.IPRN.EQ.1)GO TO 687
IF(IPIXCT.GE.4)GO TO 685
IFOUND(IFDCT)=1
IFOUND(IFDCT+1)=ISTOPEYE-4*ICTRTOI
IFOUND(IFDCT+2)=ISTOPEYE+2*ICTRTOI
IFOUND(IFDCT+3)=STRIPDATA(388)-2*ICTRTOI
IFOUND(IFDCT+4)=STRIPDATA(389)+2*ICTRTOI
IFDCT=IFDCT+5
IF(IPRN.EQ.1)GO TO 687

685  IF(ISTOPEYE.GT.53)GO TO 70

TYPE
TYPE "TEST IMAGE SEARCH FOR FACE COMPLETED"
TYPE
GO TO 9999


C  The following section of code (for printing) was stranded by setting
C  IPRN=0 at the beginning of this program. It has been left in since,
C  in the future, a user may wish to use it. It may be used by coding
C  an ACCEPT statement at the beginning of this program, which will
C  allow the operator to determine the value of the variable "IPRN".

C*************************************************************************
C  Print the Data on the Printer (if requested)                          *
C*************************************************************************

C  Get date and time
687  CALL FGDAY(IMONTH,IDAY,IYEAR)
CALL FGTIME(IHOUR,IMINUTE,ISEC)
IF(IPRN.EQ.1)WRITE(12,690)IPIXCT
IF(IPRN.EQ.1)WRITE(12,7091)
IF(IPRN.EQ.1)WRITE(12,700)IMONTH/10,MOD(IMONTH,10),IDAY/10,MOD(IDAY,10),
1  IYEAR/10,MOD(IYEAR,10),IHOUR/10,MOD(IHOUR,10),IMINUTE/10,MOD(IMINUTE,10)
690  FORMAT("1 * * * Feature Locations for TEST SUBJECT #",I2," * * *")
700  FORMAT(" Date: ",I1,I1,"/",I1,I1,"/",I1,I1,5X,"Time: ",I1,I1,":",I1,I1)
710  CONTINUE

C  Display critical parameters for horiz features
IF(IPRN.EQ.1)WRITE(12,7095)GESTDATA(261)
IF(IPRN.EQ.1)WRITE(12,7107)GESTDATA(273)
IF(IPRN.EQ.1)WRITE(12,7094)GESTDATA(260)
IF(IPRN.EQ.1)WRITE(12,7108)GESTDATA(274)
IF(IPRN.EQ.1)WRITE(12,7096)GESTDATA(262)
IF(IPRN.EQ.1)WRITE(12,7109)GESTDATA(275)
IF(IPRN.EQ.1)WRITE(12,7110)GESTDATA(276)
C  IF(IPRN.EQ.1)WRITE(12,7102)GESTDATA(268)
C  IF(IPRN.EQ.1)WRITE(12,7103)GESTDATA(269)
C  IF(IPRN.EQ.1)WRITE(12,7104)GESTDATA(270)
C  IF(IPRN.EQ.1)WRITE(12,7105)GESTDATA(271)
IF(IPRN.EQ.1)WRITE(12,7106)GESTDATA(272)

C  Close printer file to let someone else use it.
IF(IPRN.EQ.1)CALL CLOSE(12,IER)


7077  FORMAT(1X," * * * COORDINATE POINTS FOR
7090  FORMAT(1X," Coordinate Points for SECTION
7091  FORMAT(1X," ")
7094  FORMAT(1X," CENTER OF EYES ",I2)
7095  FORMAT(1X," LEFT SIDE EYES ",I2)
7096  FORMAT(1X," RIGHT SIDE EYES ",I2)
7109  FORMAT(1X," TOP OF NOSE ",I2)
7110  FORMAT(1X," CTR OF MOUTH ",I2)
C102  FORMAT(1X," LOSLOPE = ",I2)
C103  FORMAT(1X," LISLOPE = ",I2)
C104  FORMAT(1X," RISLOPE = ",I2)
C105  FORMAT(1X," ROSLOPE = ",I2)
7106  FORMAT(1X," ITOPEYE = ",I2)
7107  FORMAT(1X," MAXLL = ",I2)
7108  FORMAT(1X," MAXRL = ",I2)

IF(IPIXCT.GE.4)GO TO 685

9999  CONTINUE

TYPE "<7>"
TYPE
TYPE

C  Go back for another...
IF(ISTOPEYE.GT.53)GO TO 70
GO TO 30

END


CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C  GETSUBJ.FR  10/13/86  BY CAPT CO SMITH
C
C  Top level program used in performing autonomous face
C  location and recognition in a visual image.
C
C  Uses the OCTEK 2000 Image Analyzer Card in the NOVA
C
C  Adapted from "Picture1.fr"
C  by James R. Molten III (6 May 85)
C
C  Use "GETSUBJ.MC" to compile and load
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

COMMON ICTPIX,ICHOICE
INTEGER ICT(120),STRIPDATA(400),GESTDATA(300)
INTEGER IPIX(6150)
INTEGER OUTFILE(40)
INTEGER SUMFILE(20)
REAL PARAM(16)
IXL=320
IXC=160
IYL=240
IYC=120
NEWREC=0
IC=1
LASTW=0
IPIXCT=1

CALL DFILW("CORDPTS64.B",IER)

10  CONTINUE
TYPE "<33>E <15><15><15><7>"
TYPE " Autonomous Face Recognition Machine."
TYPE
TYPE " Keypad Menu"

TYPE  '  - . . . - . 

TYPE "  1  2  3  4  5  6  7  8"

TYPE  “  - . - . 

TYPE "  1---camera #1 on"
TYPE "  4---camera #1 off"
TYPE "  6---process picture"
TYPE "  8---terminate and exit to system"
IF (NEWREC.EQ.1) GOTO 40
NEWREC=1
CALL SINTRO(ICT,63K,IER)
IF(IER.EQ.1) GO TO 30
TYPE "INTRO ECODE =",IER
STOP "UNABLE TO INITIALIZE"

30  CONTINUE

CALL OPEN(3,"IACHON.XB",2,IER)
IF(IER.NE.1) TYPE "WARNING: Unable to access IACHON.XB"
IF(IER.EQ.1) CALL LX1(ICT,3)
CALL HPSUH(ICT,0,1)
C
C  Initialize the variables and devices.
IHP=0
TYPE
TYPE " Interactive video input control"
TYPE "<33>J"
CALL XflAIB(ICT)
CALL GREYSCALE(ICT,1)
CALL GREYSCALE(ICT,2)


CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C  Loop until operator chooses an option
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

40  CONTINUE
TYPE "<33>k"
C  Sound the annunciator
CALL SOB(ICT,0,0)
CALL SOB(ICT,1,0)
50  CONTINUE
C  Input the bottom key row.
CALL SOB(ICT,0,1)
IWRD=IRDEXT(ICT)
IF (IWRD.EQ.LASTW) GOTO 50
LASTW=IWRD
IF (IWRD.EQ.0) GOTO 50
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C  Check for camera control (left half bottom row keys).
IF (IWRD.LT.16) GOTO 80
C  Turn on the camera?
IF (IWRD.AND.128) GO TO 60
C  Turn off camera?
IF (IWRD.AND.16) GOTO 70

60  CONTINUE
C  Turn on currently selected camera.
IC=1
CALL INTLACE(ICT,1)
CALL SYNC(ICT,IC,ILOCK)
IF (ILOCK.EQ.0) GOTO 200

CALL VON(ICT,0)
CALL BOXCUR(ICT,192,128)
CALL HCTAB(ICT,IXC-96,IYC-64)
TYPE
TYPE "Ensure subject faces are located within the box cursor"
TYPE "then press button #4 on the keypad to freeze the image."
GOTO 40


70  CONTINUE
C  Turn off camera.
CALL VOFF(ICT)
CALL SYNC(ICT,0,ILOCK)
CALL INTLACE(ICT,0)
GOTO 10

80  CONTINUE
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C  Check lower row of buttons, right half.
C
C  Check for exit selection
IF(IWRD.AND.1)GO TO 190
C  Check for Process Picture command.
IF(IWRD.AND.4)GO TO 144
GOTO 40

144  INUMPASS=1
TYPE "<33>E <15><15><15><7>"
TYPE "CREATING FILE OF IMAGE CONTAINED WITHIN THE BOX CURSOR"
CALL BOXCUR(ICT,192,128)
CALL HCTAB(ICT,IXC-96,IYC-64)

148  CONTINUE
GO TO (150,153,155,157,160),INUMPASS

150  CALL DFILW("SUBJECT.VD",IER)
CALL CFILW("SUBJECT.VD",3,96,IER)
CALL OPEN(1,"SUBJECT.VD",3,IER)
IF (IER.EQ.1) GOTO 151
TYPE " - error opening file IER=",IER
GOTO 159

C  Extract the 192 by 128 pixel search area from the image
C  on the TV screen.

151  CONTINUE
CALL RVBLK(ICT,IPIX,IXC-96,192,IYC-64,32)
C  TYPE "IPIX(1) = ",IPIX(1)
IF(IHP.NE.3)CALL WRBLK(1,0,IPIX(2),24,IER)
GO TO 159

153  CONTINUE
CALL RVBLK(ICT,IPIX,IXC-96,192,IYC-32,32)
IF(IHP.NE.3)CALL WRBLK(1,24,IPIX(2),24,IER)
GO TO 159

155  CONTINUE
CALL RVBLK(ICT,IPIX,IXC-96,192,IYC,32)
IF(IHP.NE.3)CALL WRBLK(1,48,IPIX(2),24,IER)
GO TO 159

157  CONTINUE
CALL RVBLK(ICT,IPIX,IXC-96,192,IYC+32,32)
IF(IHP.NE.3)CALL WRBLK(1,72,IPIX(2),24,IER)

159  CONTINUE
INUMPASS=INUMPASS+1
GO TO 148

160  CALL CLOSE(1,IER)

C  Tell Eclipse that image search area is ready and to begin
C  analysis to find faces.

CALL CFILW("SUBJECT1.A",IER)
TYPE
TYPE "<7>IMAGE FILE READY, SEARCH ROUTINE NOW STARTING ON ECLIPSE"

C  Await Eclipse request for a 64 by 64 image sub-section or
C  indication that the search for faces is complete.

164  CALL STAT("NSMITH:CORDPTS64.B",ISTAT,IER)
IF(IER.EQ.1)GO TO 166
CALL SOB(ICT,0,1)
IWRD=IRDEXT(ICT)
IF(IWRD.AND.2)GO TO 169
GO TO 164

166  CALL OPEN(1,"CORDPTS64.B",3,IER)
IF(IER.EQ.1)GO TO 168
TYPE "* * ERROR IN OPENING STRIPDATA FILE * *"

168  CONTINUE
READ BINARY(1)(STRIPDATA(I),I=385,400)
TYPE
TYPE "STRIPDATA READ"
CALL CLOSE(1,IER)
CALL DFILW("CORDPTS64.B",IER)
TYPE "<33>E <15><15><15><7>"
TYPE "<7>MESSAGE RECIEVED FROM ECLIPSE:"

C  If search is done (STRIPDATA(385)=0), proceed to next phase
C  (train or identify). If search is not done, then process the
C  Eclipse request for an image sub-section.

IF(STRIPDATA(385).EQ.0)GO TO 170
GO TO 174

169  CALL CFILW("FACEDONE",2,IER)

170  IPIXCT=IPIXCT-1
CALL PXFILL(ICT,0,62,196,54,2)
CALL PXFILL(ICT,0,62,196,185,2)
CALL PXFILL(ICT,0,62,2,56,129)
CALL PXFILL(ICT,0,256,2,56,129)
CALL HCTAB(ICT,255,255)
TYPE "<33>E <15><15><15><7>"
TYPE
TYPE "FACES FOUND WITHIN THE BOX CURSOR AREA = ",IPIXCT
TYPE

171  ACCEPT "WOULD YOU LIKE TO PRINT THIS IMAGE ? (1=YES,0=NO)",IPRN
IF(IPRN.NE.1.AND.IPRN.NE.0)GO TO 171
IF(IPRN.EQ.0)GO TO 1711
CALL SWAP("SVPIC.SV",IER)

1711  IF(IPIXCT.LE.0)GO TO 10

172  TYPE
TYPE
TYPE "<7>WOULD YOU LIKE TO : "
TYPE
TYPE "  1 - TRAIN WITH THESE FACES?"
TYPE "  2 - IDENTIFY THESE FACES? "
TYPE "  3 - QUIT?"
TYPE
ACCEPT " CHOICE: ",ICHOICE
IF(ICHOICE.LT.1.OR.ICHOICE.GT.3)GO TO 172
IF(ICHOICE.EQ.3)GO TO 10
ICTPIX=0

1734  ICTPIX=ICTPIX+1
IF(ICTPIX.EQ.1)SUMFILE(1)=1
IF(ICTPIX.EQ.2)SUMFILE(1)=2
IF(ICTPIX.EQ.3)SUMFILE(1)=3
IF(ICTPIX.EQ.4)SUMFILE(1)=4


C  Signal Eclipse to set up (activate CORTRAN16) to calculate
C  facial window gestalts.

CALL CFILW("CALCGEST",2,IER)
CALL STAT("NOVASIG1",ISTAT,IER)
IF(IER.NE.1)GO TO 1733
CALL DFILW("NOVASIG1",IER)
IF(IER.NE.1)TYPE "DELETE FILE ERROR ON NOVASIG1"
1733  CONTINUE
CALL CFILW("NOVASIG1",2,IER)
IF(IER.NE.1)TYPE "CREATE FILE ERROR ON NOVASIG1"
CALL OPEN(1,"NOVASIG1",2,IER)
IF(IER.NE.1)TYPE "OPEN FILE ERROR ON NOVASIG1"
WRITE BINARY(1) SUMFILE(1)
CALL CLOSE(1,IER)

C  Activate program which displays the facial window gestalts.
CALL SWAP("CTGEST.SV",IER)
TYPE
IF(ICHOICE.EQ.2)GO TO 1736
CALL SWAP("GETNAME.SV",IER)
CALL SWAP("WRNA2.SV",IER)
CALL SWAP("TRAIN.SV",IER)
GO TO 1732

1736  CALL DFILW("IDCOM",IER)
CALL SWAP("CLEANUP.SV",IER)
CALL SWAP("IDDISP.SV",IER)
CALL CFILW("IDCOM",2,IER)
CALL SWAP("SHOWDISP.SV",IER)

1732  TYPE
ACCEPT "WOULD YOU LIKE TO PRINT THIS IMAGE ? (1=YES,0=NO)",IPRN
IF(IPRN.NE.1.AND.IPRN.NE.0)GO TO 1732
IF(IPRN.EQ.0)GO TO 1731
CALL SWAP("SVPIC.SV",IER)
TYPE

1731  IF(ICTPIX.LT.IPIXCT)GO TO 1734
GO TO 10


C  Process an Eclipse request for an image sub-section.

174  TYPE
TYPE "PROCESSING ECLIPSE REQUEST FOR 64X64 IMAGE SUBSECTION"
IYCDEL=STRIPDATA(393)-150
IXCDEL=STRIPDATA(387)-128
CALL BOXCUR(ICT,64,64)
CALL HCTAB(ICT,IXC+IXCDEL,IYC+IYCDEL)
CALL RVBLK(ICT,IPIX,IXC+IXCDEL,64,IYC+IYCDEL,64)
CALL DFILW("WIND1.PI",IER)
CALL CFILW("WIND1.PI",3,18,IER)
CALL OPEN(1,"WIND1.PI",3,IER)
IF(IER.EQ.1)GO TO 176
TYPE "* * ERROR OPENING WIND1.PI FILE * *"

176  CALL WRBLK(1,0,IPIX(2),16,IER)
CALL CLOSE(1,IER)
CALL CFILW("NOVASIG1.A",IER)
TYPE
TYPE "64X64 IMAGE SUBSECTION SENT TO ECLIPSE"

C  Await results of the Eclipse analysis of the image sub-
C  section. Display these results to the operator.

178  CALL STAT("NSMITH:COORDPTS1.B",ISTAT,IER)
IF(IER.NE.1)GO TO 178
TYPE "<15><15>"
TYPE "<7>IMAGE SUBSECTION PROCESSING COMPLETED"
CALL OPEN(1,"COORDPTS1.B",3,IER)
IF(IER.EQ.1)GO TO 180
TYPE "<15>* * ERROR OPENING SUBSECTION FILE * *"

180  READ BINARY(1)(GESTDATA(I),I=1,300),(PARAM(I),I=1,16)
CALL CLOSE(1,IER)
IF(IER.NE.1)TYPE "CLOSE ERROR ON SUBSECTION FILE"
CALL DFILW("COORDPTS1.B",IER)
IF(GESTDATA(257).NE.0)GO TO 184
TYPE
TYPE "FACE NOT FOUND IN THIS SUBSECTION"
TYPE
TYPE "CONTINUING SEARCH"
TYPE
TYPE "HIT OCTEK KEYPAD #7 TO DISCONTINUE THE SEARCH"
CALL BOXCUR(ICT,192,128)
CALL HCTAB(ICT,IXC-96,IYC-64)
GO TO 164
184  TYPE
TYPE "FACE FOUND IN THE BOX CURSOR SUBSECTION"

ITOPNDEL=0
ITOPNOSE=GESTDATA(275)

C  IF FACE LOCATION WAS TOO HIGH IN FIRST IMAGE, ADJUST CURSOR DOWN
C  AND RESNAP IMAGE.

IF(ITOPNOSE.GE.43)GO TO 192
TYPE "<15>CENTERING IMAGE"
ITOPNDEL=43-ITOPNOSE
IXCDEL=STRIPDATA(387)-128
IYCDEL=STRIPDATA(393)-150-ITOPNDEL
CALL HCTAB(ICT,IXC+IXCDEL,IYC+IYCDEL)
CALL RVBLK(ICT,IPIX,IXC+IXCDEL,64,IYC+IYCDEL,64)
CALL DFILW("WIND1.PI",IER)
CALL CFILW("WIND1.PI",3,18,IER)
CALL OPEN(1,"WIND1.PI",3,IER)
IF(IER.EQ.1)GO TO 194
TYPE "* * ERROR OPENING WIND1.PI FILE * *"

194  CALL WRBLK(1,0,IPIX(2),16,IER)
CALL CLOSE(1,IER)

192  ITOPNOSE=GESTDATA(275)+ITOPNDEL
IMOUTH=GESTDATA(276)+ITOPNDEL
GESTDATA(275)=ITOPNOSE
GESTDATA(276)=IMOUTH
ILFTEYE=GESTDATA(261)
IRHTEYE=GESTDATA(262)
MAXDEL=GESTDATA(274)-GESTDATA(273)
ITOPEYE=ITOPNOSE-MAXDEL
ITOPHEAD=ITOPEYE-MAXDEL
IF(ITOPHEAD.LT.1)GO TO 205
GO TO 196

205  TYPE
TYPE "FACIAL IMAGE IS TOO LARGE TO SAVE AS A 64X64 PICTURE"
TYPE
CALL BOXCUR(ICT,192,128)
CALL HCTAB(ICT,IXC-96,IYC-64)
GO TO 164

196  GESTDATA(285)=ITOPEYE
GESTDATA(286)=ITOPHEAD

C  The next 3 gestdata values represent default values for the
C  dimensions of the image (64X64) and the FSTOP used (8.0).
GESTDATA(45)=64
GESTDATA(46)=64
GESTDATA(47)=80

CALL OPEN(1,"WIND1.PI",3,IER)
CALL WRBLK(1,16,GESTDATA(45),1,IER)
CALL WRBLK(1,17,PARAM,1,IER)
CALL CLOSE(1,IER)
TYPE
TYPE " IMAGE AND FEATURE LOCATIONS SAVED AS TEST",IPIXCT
TYPE

IF(IPIXCT.EQ.1)CALL RENAM("WIND1.PI","TEST1.PI",IER)
IF(IPIXCT.EQ.2)CALL RENAM("WIND1.PI","TEST2.PI",IER)
IF(IPIXCT.EQ.3)CALL RENAM("WIND1.PI","TEST3.PI",IER)
IF(IPIXCT.EQ.4)CALL RENAM("WIND1.PI","TEST4.PI",IER)
IPIXCT=IPIXCT+1


C  Draw grid marks on the face at the location it was found.

DO 210 I=ILFTEYE,IRHTEYE
IELEM=(ITOPHEAD-1)*64+I+1
IPIX(IELEM)=0
210  CONTINUE

DO 220 I=ILFTEYE,IRHTEYE
IELEM=(ITOPEYE-1)*64+I+1
IPIX(IELEM)=0
220  CONTINUE

DO 230 I=ILFTEYE,IRHTEYE
IELEM=(ITOPNOSE-1)*64+I+1
IPIX(IELEM)=0
230  CONTINUE

DO 240 I=ILFTEYE,IRHTEYE
IELEM=(IMOUTH-1)*64+I+1
IPIX(IELEM)=0
240  CONTINUE

C  DELINEATE THE CENTER OF EYES
ICTREYE=GESTDATA(260)
DO 250 I=ITOPHEAD,IMOUTH
IELEM=ICTREYE+64*(I-1)+1
IPIX(IELEM)=0
250  CONTINUE

C  DELINEATE THE LEFT SIDE OF THE EYES
DO 260 I=ITOPHEAD,IMOUTH
IELEM=ILFTEYE+64*(I-1)+1
IPIX(IELEM)=0
260  CONTINUE

C  DELINEATE THE RIGHT SIDE OF THE EYES
DO 270 I=ITOPHEAD,IMOUTH
IELEM=IRHTEYE+64*(I-1)+1
IPIX(IELEM)=0
270  CONTINUE

C  Display the outer boundaries of the facial image.

IYCDEL=STRIPDATA(393)-150-ITOPNDEL
IXCDEL=STRIPDATA(387)-128
CALL WVBLK(ICT,IPIX,IXC+IXCDEL,64,IYC+IYCDEL,64)
IF(IPIXCT.GE.5)GO TO 170
TYPE
TYPE "CONTINUING SEARCH"
TYPE
TYPE "HIT OCTEK KEYPAD #7 TO DISCONTINUE THE SEARCH"
CALL BOXCUR(ICT,192,128)
CALL HCTAB(ICT,IXC-96,IYC-64)
GOTO 164


190  CONTINUE
C  Terminate Eclipse Processing
CALL CFILW("TERMFACE",2,IER)
C  Exit.
TYPE "<33>J Processing done."
GOTO 999

200  CONTINUE
TYPE "<33>J ** Camera input is missing (unable to find video sy
GO TO 40

999  CONTINUE
CALL OREMOVE(ICT)
CALL RESET

A computer algorithm was developed which successfully locates and
identifies human face(s) that are present in a digitized computer image.
In the process of finding the facial image, the algorithm simultaneously
determines the boundary locations of the sides of the eyes, the center of
the face, the tip of the nose and the vertical center of the mouth.

Detection of facial images is based on analysis of a digitized scene for
the presence of characteristic facial feature signatures for the eyes,
nose and mouth. These signatures are generated by the application of a
"center of mass" calculation to each pixel row and column for various
sub-sections of the digitized scene. The presence of a face is confirmed,
and its feature locations are determined, based on the presence and
location of the local maxima and minima which occur in these curvilinear
signatures.
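
The signature idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis code: the listings apply a transform routine to each window, whereas the sketch below uses a plain intensity-weighted mean position, and the names `row_signature`, `column_signature` and `local_extrema` are hypothetical.

```python
import numpy as np


def row_signature(window):
    """For every pixel row, return the intensity-weighted mean column index."""
    window = np.asarray(window, dtype=float)
    cols = np.arange(window.shape[1])
    mass = window.sum(axis=1)              # total intensity of each row
    moment = (window * cols).sum(axis=1)   # first moment of each row
    centre = (window.shape[1] - 1) / 2.0
    # A row with zero intensity has no centroid; fall back to the geometric centre.
    safe_mass = np.where(mass > 0, mass, 1.0)
    return np.where(mass > 0, moment / safe_mass, centre)


def column_signature(window):
    """Same calculation applied to every pixel column."""
    return row_signature(np.asarray(window).T)


def local_extrema(signature):
    """Return (maxima, minima): indices of strict interior local extrema."""
    s = np.asarray(signature, dtype=float)
    maxima, minima = [], []
    for i in range(1, len(s) - 1):
        if s[i] > s[i - 1] and s[i] > s[i + 1]:
            maxima.append(i)
        elif s[i] < s[i - 1] and s[i] < s[i + 1]:
            minima.append(i)
    return maxima, minima
```

A feature detector in this spirit would compare the positions of the returned maxima and minima against the pattern expected for eyes, nose and mouth; the acceptance thresholds themselves are not part of this sketch.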

Once the face(s) have been located, individual face recognition is
performed by calculating the "gestalt point" for six different regions of
the face. The gestalt is a location, in the 2-dimensional facial region,
which corresponds closely to the center of mass of the pixel intensity
distribution for that region. Identification of the unknown face is
performed by comparing its gestalts with those for known faces, using a
distance metric.
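
As a rough illustration of this recognition step, the Python sketch below computes a 2-D center of mass per region and ranks known faces by distance. It rests on assumptions: the abstract does not name the distance metric, so Euclidean distance is used here, and `gestalt_point`, `face_signature` and `rank_candidates` are hypothetical names rather than routines from the listings.

```python
import numpy as np


def gestalt_point(region):
    """Return the (row, column) center of mass of a region's pixel intensities."""
    region = np.asarray(region, dtype=float)
    total = region.sum()
    if total == 0:
        # No intensity at all: fall back to the geometric centre of the region.
        return ((region.shape[0] - 1) / 2.0, (region.shape[1] - 1) / 2.0)
    rows, cols = np.indices(region.shape)
    return (float((rows * region).sum() / total),
            float((cols * region).sum() / total))


def face_signature(regions):
    """Concatenate the gestalt points of all regions into one flat vector."""
    return np.array([gestalt_point(r) for r in regions]).ravel()


def rank_candidates(unknown, known):
    """Sort known names by Euclidean distance to the unknown signature.

    `known` maps a name to a signature vector of the same length as `unknown`.
    """
    distances = {name: float(np.linalg.norm(unknown - sig))
                 for name, sig in known.items()}
    return sorted(distances, key=distances.get)
```

With six regions per face, `face_signature` yields a 12-element vector, and the first few names returned by `rank_candidates` play the role of the top candidates discussed in the test results.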

The algorithm's face locator function was tested on 139 facial images
representing thirty different subjects. The algorithm successfully
located and bounded the internal region of the face in 94% of the cases.
Further tests, against a limited number of arbitrary backgrounds,
indicate that the algorithm is highly specific for faces.

The face recognition portion of the algorithm was tested against a
database of 20 different subjects. In two trial runs, using the same 20
person database, 18 of the 20 subjects were in the top three candidates
selected. In one trial run, the algorithm successfully identified the
unknown individual (selected the proper individual as the number one
candidate) 60% of the time. In the other trial run the proper candidate
was identified 50% of the time. Recognition was based solely on analysis
of the internal facial features.